The present invention relates to a method for preparing positive polypeptide variants by shuffling different nucleotide sequences
of homologous DNA sequences by in vivo recombination comprising the steps of a) forming at least one circular plasmid comprising a
DNA sequence encoding a polypeptide, b) opening said circular plasmid(s) within the DNA sequence(s) encoding the polypeptide(s), c)
preparing at least one DNA fragment comprising a DNA sequence homologous to at least a part of the polypeptide coding region on at least
one of the circular plasmid(s), d) introducing at least one of said opened plasmid(s), together with at least one of said homologous DNA
fragment(s) covering full-length DNA sequences encoding said polypeptide(s) or parts thereof, into a recombination host cell, e) cultivating
said recombination host cell, and f) screening for positive polypeptide variants.
Despite the existence of the above methods there is still a need for
even better iterative in vivo recombination methods for preparing
novel positive polypeptide variants.



SUMMARY OF THE INVENTION 

The object of the present invention is to provide an improved method
for preparing positive polypeptide variants by an in vivo
recombination method.

The inventor of the present invention has surprisingly found that
such positive polypeptide variants may advantageously be prepared by
shuffling different nucleotide sequences of homologous DNA sequences
by in vivo recombination comprising the steps of

a) forming at least one circular plasmid comprising a DNA sequence
encoding a polypeptide,

b) opening said circular plasmid(s) within the DNA sequence(s)
encoding the polypeptide(s),

c) preparing at least one DNA fragment comprising a DNA sequence
homologous to at least a part of the polypeptide coding region on at
least one of the circular plasmid(s),

d) introducing at least one of said opened plasmid(s), together with
at least one of said homologous DNA fragment(s) covering full-length
DNA sequences encoding said polypeptide(s) or parts thereof, into a
recombination host cell,

e) cultivating said recombination host cell, and

f) screening for positive polypeptide variants.

BRIEF DESCRIPTION OF THE DRAWINGS 

Figure 1 shows the yeast expression plasmid pJS026 comprising DNA
sequence encoding the Humicola lanuginosa lipase gene.

Figure 2 shows the yeast expression plasmid pJS037, comprising DNA
sequence encoding the Humicola lanuginosa lipase gene containing
twelve additional restriction sites.

Figure 3 shows the plasmid pJS026.

Figure 4 shows the plasmid pJS037.

Figure 5 shows the in vivo recombination of the 0.9 kb synthetic
wild-type Humicola lanuginosa lipase with pJS037 using Saccharomyces
cerevisiae as the recombination host cell (described in Example 1).

Figure 6 shows the in vivo recombination of a DNA fragment prepared
from Humicola lanuginosa lipase variant (y) with Humicola lanuginosa
lipase variant (d) comprised in a plasmid using Saccharomyces
cerevisiae as the recombination host cell (described in Example 2).

Figure 7 shows an overview of the location of the inactivation
site of the Humicola lanuginosa lipase gene and the number of the
clone (referred to as "blue number" in the tables). Location of
restriction enzyme sites and clone numbers are relative to the
initiation codon of the lipase gene. In all cases a stop codon was
located in the new reading frame 10 to 50 bp from the frameshift.

Figure 8 shows an overview of the creation of active Humicola
lanuginosa lipase genes from the recombinations in Table 2A and B
by a "mosaic mechanism". Lines indicate the introduction of the
fragment sequence into the vector and lines with an x indicate
sequences that are not introduced in the active lipase colonies.
The primers used for the PCR fragment are shown together with the
location of the frameshift mutation (marked by the restriction site
used for the construction).

Figure 9 shows an overview of fragments used in the recombination
of 2 partial overlapping fragments into a gapped vector. The
primers used for the PCR fragments are shown together with the
location of the frameshift mutation (if not wild type).

Figure 10 shows an overview of fragments used in the recombination
of 3 partial overlapping fragments into a gapped vector. The
primers used for the PCR fragments are shown. The overlap between
PCR353 and 355 is only 10 bp.

DETAILED DESCRIPTION OF THE INVENTION 

The object of the present invention is to provide an improved method 
for preparing positive polypeptide variants by an iterative in vivo 
recombination method. 

The inventor of the present invention has surprisingly found an
efficient method for shuffling homologous DNA sequences in an in
vivo recombination system using a eukaryotic cell as a recombination
host cell.
A "recombination host cell" is in the context of the present
invention a cell capable of mediating shuffling of a number of
homologous DNA sequences.

The term "shuffling" means recombination of nucleotide sequence(s)
between two or more homologous DNA sequences resulting in output DNA
sequences (i.e. DNA sequences having been subjected to a shuffling
cycle) having a number of nucleotides exchanged, in comparison to
the input DNA sequences (i.e. the starting point DNA sequences).



An important advantage of the invention is that mosaic DNA sequences
with multiple [...] are created, which is not discovered [...].

Another important advantage of the present invention is that when
using a mixture of fragments and vectors (in the screening set up) it
gives the possibility for many different clones to recombine pairwise
or even [...] (as will be seen in a couple of examples below).

The in vivo recombination method of the invention is simple to
perform [...] of homologous genes or variants. A large number of
variants or homologous genes can be mixed in one transformation. The
mixing of improved variants or wild-type genes followed by screening
increases the number of further improved variants manyfold compared
to doing only random mutagenesis.

Recombination of multiple overlapping fragments is possible with a
high efficiency, increasing the mixing of variants or homologous
genes using the in vivo recombination method. An overlap as small as
10 bp is sufficient for recombination, which may be utilized for very
easy domain shuffling of even distantly related genes.






The invention relates to a method for preparing polypeptide variants
by shuffling different nucleotide sequences of homologous DNA
sequences by in vivo recombination comprising the steps of

a) forming at least one circular plasmid comprising a DNA sequence
encoding a polypeptide,

b) opening said circular plasmid(s) within the DNA sequence(s)
encoding the polypeptide(s),

c) preparing at least one DNA fragment comprising a DNA sequence
homologous to at least a part of the polypeptide coding region on at
least one of the circular plasmid(s),

d) introducing at least one of said opened plasmid(s), together with
at least one of said homologous DNA fragment(s) covering full-length
DNA sequences encoding said polypeptide(s) or parts thereof, into a
recombination host cell,

e) cultivating said recombination host cell, and

f) screening for positive polypeptide variants.
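
As a rough illustration of steps b) to d), the repair of an opened
plasmid by a homologous fragment can be sketched in Python. This is a
toy model under stated assumptions and not the procedure of the
invention: both sequences are assumed to be pre-aligned and equally
long, recombination is reduced to one randomly placed crossover on
each side of the opening site, and the names repair_opened_plasmid,
cut and rng are hypothetical.

```python
import random


def repair_opened_plasmid(plasmid_gene: str, fragment: str, cut: int,
                          rng: random.Random) -> str:
    """Return a mosaic gene built from an opened plasmid and a fragment.

    Toy assumptions (not from the source): the two sequences are aligned
    and equally long, and repair uses one crossover on each side of the
    opening site `cut`.
    """
    if len(plasmid_gene) != len(fragment):
        raise ValueError("sequences must be aligned and equally long")
    if not 0 < cut < len(plasmid_gene):
        raise ValueError("the opening site must lie inside the gene")
    left = rng.randint(0, cut)                    # crossover upstream of the cut
    right = rng.randint(cut, len(plasmid_gene))   # crossover downstream of the cut
    # Plasmid sequence outside the crossovers, fragment sequence between them.
    return plasmid_gene[:left] + fragment[left:right] + plasmid_gene[right:]


if __name__ == "__main__":
    rng = random.Random(0)
    vector_gene = "A" * 30   # stands in for the plasmid-borne coding sequence
    donor = "C" * 30         # stands in for the homologous DNA fragment
    for _ in range(3):
        print(repair_opened_plasmid(vector_gene, donor, cut=15, rng=rng))
```

Each printed line is one hypothetical output sequence; the stretch of
C marks the region taken over from the fragment around the opening
site.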



According to the invention more than one cycle of steps a) to f) may
be performed.

The opening of the plasmid(s) in step b) can be directed toward any
site within the polypeptide coding region of the plasmid. The
plasmid(s) may be opened by any suitable methods known in the art.

The opened ends of the plasmid may be filled in with nucleotides as
described in Pompon et al. (1989), supra. It is preferred not to fill
in the opened ends as it might create a frameshift.

It is preferred to open the plasmid(s) around the middle of the
polypeptide coding DNA sequence(s), as this is believed to result in
a more effective recombination between DNA fragment(s) and opened
plasmid(s).

In an embodiment of the invention the DNA fragment(s) is (are)
prepared under conditions resulting in a low, medium or high random
mutagenesis frequency.



To obtain low mutagenesis frequency the DNA sequence(s) (comprising
the DNA fragment(s)) may be prepared by a standard PCR amplification
method (US 4,683,202 or Saiki et al., (1988), Science 239, 487 -
491).

A medium or high mutagenesis frequency may be obtained by performing
the PCR amplification under conditions which increase the
misincorporation of nucleotides, for instance as described by Deshler,
(1992), GATA 9(4), 103-106; Leung et al., (1989), Technique, Vol. 1,
No. 1, 11-15.
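
The difference between a low and a high mutagenesis frequency can be
pictured with a small Python sketch. It is a toy stand-in for
misincorporation during PCR, not a model of any real polymerase: each
base is independently replaced with probability error_rate, and the
function names and the example rates are hypothetical.

```python
import random

BASES = "ACGT"


def error_prone_copy(template: str, error_rate: float,
                     rng: random.Random) -> str:
    """Copy `template`, replacing each base by a different base with
    probability `error_rate` (toy model of nucleotide misincorporation)."""
    if not 0.0 <= error_rate <= 1.0:
        raise ValueError("error_rate must lie between 0 and 1")
    copied = []
    for base in template:
        if rng.random() < error_rate:
            # Pick any base other than the templated one.
            copied.append(rng.choice([b for b in BASES if b != base]))
        else:
            copied.append(base)
    return "".join(copied)


def count_differences(a: str, b: str) -> int:
    """Number of positions at which two equally long sequences differ."""
    return sum(x != y for x, y in zip(a, b))


if __name__ == "__main__":
    rng = random.Random(42)
    template = "ATGGCTAAACTGACCGGG" * 50      # hypothetical 900 bp template
    for label, rate in [("low", 0.001), ("medium", 0.005), ("high", 0.02)]:
        copy = error_prone_copy(template, rate, rng)
        print(label, count_differences(template, copy))
```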

It is also contemplated according to the invention to combine the
PCR amplification (i.e. according to this embodiment also DNA
fragment mutation) with a mutagenesis step using a suitable physical
or chemical mutagenizing agent, e.g., one which induces transitions,
transversions, inversions, scrambling, deletions, and/or insertions.

In the context of the present invention the term "positive
polypeptide variants" means resulting polypeptide variants possessing
functional properties which have been improved in comparison to the
polypeptides producible from the corresponding input DNA sequences.
Examples of such improved properties can be as different as e.g.
biological activity, enzyme washing performance, antibiotic
resistance etc.



Consequently, which screening method is to be used for identifying
positive variants depends on the desired improved property of the
polypeptide variant in question.



If, for instance, the polypeptide in question is an enzyme and the
desired improved functional property is the wash performance, the
screening in step f) may conveniently be performed by use of a
filter assay based on the following principle:

The recombination host cell is incubated on a suitable medium and
under suitable conditions for the enzyme to be secreted, the medium
being provided with a double filter comprising a first
protein-binding filter and on top of that a second filter exhibiting a
low protein binding capability. The recombination host cell is located
on the second filter. Subsequent to the incubation, the first filter
comprising the enzyme secreted from the recombination host cell is
separated from the second filter comprising said cells. The first
filter is subjected to screening for the desired enzymatic activity
and the corresponding microbial colonies present on the second
filter are identified.



The filter used for binding the enzymatic activity may be any
protein binding filter e.g. nylon or nitrocellulose. The top filter
carrying the colonies of the expression organism may be any filter
that has no or low affinity for binding proteins e.g. cellulose
acetate or Durapore®. The filter may be pre-treated with any of the
conditions to be used for screening or may be treated during the
detection of enzymatic activity.

The enzymatic activity may be detected by a dye, fluorescence,
precipitation, pH indicator, IR-absorbance or any other known
technique for detection of enzymatic activity.



The detecting compound may be immobilized by any immobilizing agent
e.g. agarose, agar, gelatine, polyacrylamide, starch, filter paper,
cloth; or any combination of immobilizing agents.



If the improved functional property of the polypeptide is not
sufficiently good after one cycle of shuffling, the polypeptide may
be subjected to another cycle.

In an embodiment of the invention at least one shuffling cycle is a
backcrossing cycle with the initially used DNA fragment, which may
be the wild-type DNA fragment. This eliminates non-essential
mutations. Non-essential mutations may also be eliminated by using
wild-type DNA fragments as the initially used input DNA material.

The method of the invention is suitable [...] of polypeptide,
including enzymes such as proteases, amylases, lipases, cutinases,
cellulases, peroxidases and oxidases.



Also contemplated according to the invention are polypeptides having
biological activity such as insulin, ACTH, glucagon, somatostatin,
somatotropin, thymosin, parathyroid hormone, pigmentary hormones,
somatomedin, erythropoietin, luteinizing hormone, chorionic
gonadotropin, hypothalamic releasing factors, antidiuretic hormones,
thyroid stimulating hormone, relaxin, interferon, thrombopoietin
(TPO) and prolactin.

Especially contemplated according to the present invention is
initially to use input DNA sequences being either wild-type, variant
or modified DNA sequences, such as DNA sequences coding for
wild-type, variant or modified enzymes, respectively, in particular
enzymes exhibiting lipolytic activity.
In an embodiment of the invention the lipolytic activity is a lipase
activity derived from the filamentous fungi of the Humicola sp., in
particular Humicola lanuginosa.

In a specific embodiment of the invention the initially used input
DNA fragment to be shuffled with a homologous polypeptide is the
wild-type DNA sequence encoding the Humicola lanuginosa lipase
derived from Humicola lanuginosa DSM 4109 described in EP 305 216
(Novo Nordisk A/S).

Also specifically encompassed by the scope of the invention are input
DNA sequences selected from the group of vectors (a) to (f) and/or
DNA fragments (g) to (aa) coding for Humicola lanuginosa lipase
variants from the list below in the Materials and Methods section.

Throughout the present application the name Humicola lanuginosa has
been used to identify one preferred parent enzyme, i.e. the one
mentioned immediately above. However, in recent years H. lanuginosa
has also been termed Thermomyces lanuginosus (a species introduced
the first time by Tsiklinsky in 1899) since the fungus shows
morphological and physiological similarity to Thermomyces
lanuginosus. Accordingly, it will be understood that whenever
reference is made to H. lanuginosa this term could be replaced by
Thermomyces lanuginosus. The DNA encoding part of the 18S ribosomal
gene from Thermomyces lanuginosus (or H. lanuginosa) has been
sequenced. The resulting 18S sequence was compared to other 18S
sequences in the GenBank database and a phylogenetic analysis using
parsimony (PAUP, Version 3.1.1, Smithsonian Institution, 1993) has
also been made. This clearly assigns Thermomyces lanuginosus to the
class of Plectomycetes, probably to the order of Eurotiales.
According to the Entrez Browser at the NCBI (National Center for
Biotechnology Information), this relates Thermomyces lanuginosus to
families like Eremascaceae, Monascaceae, Pseudoeurotiaceae and
Trichocomaceae, the latter containing genera like Emericella,
Aspergillus, Penicillium, Eupenicillium, Paecilomyces, Talaromyces,
Thermoascus and Sclerocleista.

Consequently, such genes encoding lipolytic enzymes of filamentous
fungi of the genera Emericella, Aspergillus, Penicillium,
Eupenicillium, Paecilomyces, Talaromyces, Thermoascus and
Sclerocleista are also specifically contemplated according to the
present invention.

Other examples of relevant filamentous fungi genes encoding
lipolytic enzymes include strains of the Absidia sp., e.g. the
strains listed in WO 96/13578 (from Novo Nordisk A/S), which is
hereby incorporated by reference. Absidia sp. strains listed in WO
96/13578 include Absidia blakesleeana, Absidia corymbifera and
Absidia reflexa.

Strains of Rhizopus sp., in particular Rh. niveus and [...], are
also contemplated according to the invention.



The lipolytic gene may also be derived from a bacterium, such as a
strain of the Pseudomonas sp., in particular [...] Ps. stutzeri, Ps.
cepacia and Ps. fluorescens (WO 89/04361), [...] or Ps. alcaligenes
and Ps. pseudoalcaligenes ([...] EP 331 376, or WO 94/25578 [...]
lipolytic enzyme [...]), the Pseudomonas sp. [...] disclosed in
[...], or a Pseudomonas sp. [...], such as the Ps. mendocina (also
[...]) lipolytic enzyme described in WO 86/09357 and US 5,389,536 or
variants thereof as described in US 5,352,594, or Ps. aeruginosa or
Ps. glumae, or Ps. syringae, or Ps. wisconsinensis (WO [...]), or a
strain of Bacillus sp., e.g. the B. subtilis described by Dartois et
al., (1993) Biochemica et Biophysica Acta 1131, [...], or B.
stearothermophilus (JP 64/7744992) or B. pumilus (WO 91/16422), or a
strain of Streptomyces sp., e.g. S. scabies, or a strain of
Chromobacterium sp., e.g. C. viscosum.



In connection with the Pseudomonas sp. lipases it has been found
that lipases from the following organisms have a high degree of
homology [...]: [...] commercially available as Liposam®, Ps.
aeruginosa EF2, [...], Ps. aeruginosa TE3285, Ps. sp. 109, Ps.
pseudoalcaligenes M1, Ps. glumae, Ps. cepacia [...],
Ps. cepacia M-12-33, Ps. sp. KWI-56, Ps. putida IFO 3458, Ps. putida
IFO 12049 (Gilbert, E. J., (1993), Pseudomonas lipases: Biochemical
properties and molecular cloning. Enzyme Microb. Technol., 15,
634-645). The species Pseudomonas cepacia has recently been
reclassified as Burkholderia cepacia, but is termed Ps. cepacia in
the present application.

Also genes encoding lipolytic enzymes from yeasts are relevant, and
include lipolytic genes from Candida sp., in particular Candida
rugosa, or Geotrichum sp., in particular Geotrichum candidum.

Specific examples of microorganisms comprising genes encoding
lipolytic enzymes used for commercially available products and which
may serve as donors of genes to be shuffled according to the
invention include Humicola lanuginosa, used in Lipolase®, Lipolase®
Ultra, Ps. mendocina used in Lumafast®, Ps. alcaligenes used in
Lipomax®, Fusarium solani, Bacillus sp. (US 5427935, EP 52 3 323),
Ps. mendocina, used in Liposam®.

Also the Pseudomonas sp. lipase gene shown in SEQ ID NO 14 is
specifically contemplated according to the invention.

It is to be emphasized that genes encoding lipolytic enzymes to be shuffled
according to the invention may be any of the above mentioned genes of
lipolytic enzymes and any variant, modification, or truncation thereof.
Examples of such genes which are specifically contemplated include the
genes encoding the enzymes described in WO 92/05249, WO 94/01541, WO
94/14951, WO 94/25577, WO 95/22615 and protein engineered lipase variants
as described in EP 407 225; a protein engineered Ps. mendocina lipase as
described in US 5,352,594; a cutinase variant as described in WO 94/14964;
a variant of an Aspergillus lipolytic enzyme as described in EP patent
167,309; and Pseudomonas sp. lipase described in WO 95/06720.



A requirement for the DNA sequences, encoding the polypeptide(s), to
be shuffled is that they are at least 60%, preferably at least 70%,
better more than 80%, especially more than 90%, and even better up
to almost 100% homologous. DNA sequences being less homologous will
have less inclination to interact and recombine.
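
The homology tiers stated above can be checked with a short Python
sketch. It rests on an assumption that is not taken from the source:
homology is approximated here as simple percent identity over two
pre-aligned, equally long sequences, with gap characters ("-")
counted as mismatches. The function names are hypothetical.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent identical positions between two pre-aligned sequences.

    Assumption: equal length; a gap character '-' never counts as a match.
    """
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and equally long")
    matches = sum(x == y and x != "-" for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def homology_tier(pct: float) -> str:
    """Map a percentage onto the preference tiers named in the text."""
    if pct > 90:
        return "especially preferred (more than 90%)"
    if pct > 80:
        return "better (more than 80%)"
    if pct >= 70:
        return "preferred (at least 70%)"
    if pct >= 60:
        return "acceptable (at least 60%)"
    return "below the stated 60% requirement"


if __name__ == "__main__":
    pct = percent_identity("ATGGCTAAACTG", "ATGGCCAAATTG")
    print(round(pct, 1), homology_tier(pct))
```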



It is also contemplated according to the invention to shuffle parent
(homologous) wild-type organisms of different genera.
Further, the DNA fragment(s) to be shuffled may preferably have a
length of from about 20 bp to 8 kb, preferably about 40 bp to 6 kb,
more preferred about 80 bp to 4 kb, especially about 100 bp to 2 kb,
to be able to interact optimally with the opened plasmid.


The method of the invention is very efficient for preparing
polypeptide variants in comparison to prior art methods comprising
transforming linear DNA fragments/sequences.



The inventor found that the transformation frequency of a mixture of
opened plasmid and a DNA fragment was significantly higher than
when transforming a plasmid cut at the same site alone. The
transformation frequency of the opened plasmid and DNA fragment was as
high as for uncut plasmid.

Without being limited to any theory it is believed that the opening
of the plasmid(s) restrict(s) the replication of (opened) plasmid(s)
when not interacting with at least one DNA fragment. In accordance
with this an increased number of recombined DNA sequences was found
after only one shuffling cycle.



As described in Example 1, 50% of the resulting transformants
contained recombined DNA sequences of both input DNA sequences. As
high as 20% of the total number of recombined DNA sequences were
"random" mixtures (i.e. having more than one region of nucleotides
exchanged).



The input DNA sequences may be any DNA sequences including wild-type
DNA sequences, DNA sequences encoding variants or mutants, or
modifications thereof, such as extended or elongated DNA sequences,
and may also be the outcome of DNA sequences having been subjected
to one or more cycles of shuffling (i.e. output DNA sequences)
according to the method of the invention or any other method (e.g.
any of the methods described in the prior art section).

When using the method of the invention the output DNA sequences
(i.e. shuffled DNA sequences) have had a number of nucleotide(s)
exchanged. This results in replacement of at least one amino acid
within the polypeptide variant, if comparing it with the parent
polypeptide. It is to be understood that also silent mutations are
contemplated (i.e. nucleotide exchanges which do not result in
changes in the amino acid sequence).
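
The distinction between a silent mutation and an amino acid
replacement can be illustrated with the following Python sketch. It
uses the standard genetic code; the function name classify_exchanges
and the short example sequences are hypothetical, and both sequences
are assumed to be in frame and equally long.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def classify_exchanges(parent: str, variant: str):
    """Compare two in-frame coding sequences codon by codon.

    Returns (silent, replacements): the number of changed codons that
    keep the amino acid, and a list of (codon number, parent amino acid,
    variant amino acid) for codons whose amino acid changed.
    """
    if len(parent) != len(variant) or len(parent) % 3:
        raise ValueError("need equally long sequences of whole codons")
    silent = 0
    replacements = []
    for i in range(0, len(parent), 3):
        p_codon, v_codon = parent[i:i + 3], variant[i:i + 3]
        if p_codon == v_codon:
            continue
        p_aa, v_aa = CODON_TABLE[p_codon], CODON_TABLE[v_codon]
        if p_aa == v_aa:
            silent += 1
        else:
            replacements.append((i // 3 + 1, p_aa, v_aa))
    return silent, replacements


if __name__ == "__main__":
    print(classify_exchanges("ATGGCTAAA", "ATGGCCAAA"))  # GCT -> GCC
    print(classify_exchanges("ATGGCTAAA", "ATGGATAAA"))  # GCT -> GAT
```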

However, the method of the present invention will in most cases lead
to the replacement of a considerable number of amino acids and may in
certain cases even alter the structure of one or more polypeptide
domains (i.e. a folded unit of polypeptide structure).

According to the present invention more than two DNA sequences may be
shuffled at the same time. Actually any number of different DNA
fragments and homologous polypeptides comprised in suitable plasmids
may be shuffled at the same time. This is advantageous as a vast
number of quite different variants can be made rapidly without an
abundance of iterative procedures.

The inventor has tested the nucleotide shuffling method of the
invention using significantly more than two homologous DNA
sequences. As described in Example 2 it was surprisingly found that
the method of the invention advantageously can be used for
recombining more than two DNA sequences.

One cycle of shuffling according to the method of the invention may
result in the exchange of from 1 to 100 nucleotides into the opened
plasmid DNA sequence encoding the polypeptide in question. The
exchanged nucleotide sequence(s) may be continuous or may be present
as a number of sub-sequences within the full-length sequence(s).

To support the present invention the inventor made a number of
additional experiments on different aspects of the method of the
invention. The experiments are described below and illustrated in
Examples 3 to 6 below.

A number of vectors and fragments comprising an inactivated
synthetic Humicola lanuginosa lipase gene were constructed by
introducing frameshift/stop codon mutations in the lipase gene at
various positions. These were used for monitoring the in vivo
recombination of different combinations of opened vector(s) and DNA
fragments. The number of active lipase colonies was scored as
described in Example 3. The number of colonies determines the
efficiency of the opened vector(s) and fragment(s) recombination.
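
How a frameshift inactivates a gene, and how the share of active
colonies can be expressed, may be sketched in Python as follows. The
sketch is illustrative only: the 15-nucleotide sequence, the
single-base deletion and the colony counts are hypothetical and are
not data from the Examples.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_stop(seq: str):
    """Index of the first in-frame stop codon (frame starts at 0), or None."""
    for i in range(0, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i
    return None


def delete_base(seq: str, pos: int) -> str:
    """Introduce a frameshift by deleting the base at `pos`."""
    return seq[:pos] + seq[pos + 1:]


def active_fraction(active_colonies: int, total_colonies: int) -> float:
    """Share of colonies in which an active gene was restored."""
    if total_colonies <= 0:
        raise ValueError("total_colonies must be positive")
    return active_colonies / total_colonies


if __name__ == "__main__":
    gene = "ATGCTGACCGGGTAA"            # hypothetical intact reading frame
    shifted = delete_base(gene, 3)      # frameshift right after the start codon
    print(first_stop(gene), first_stop(shifted))
    print(active_fraction(8, 25))       # hypothetical colony counts
```

In the shifted sequence a stop codon appears in the new reading frame
shortly after the frameshift, which is the kind of inactivation that
recombination with an intact homologous sequence can repair.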
One frameshift mutation in said Humicola lanuginosa lipase gene [...]
of the opening site gave 3 to 32% of active colonies [...]. [...]
concluded that [...] is [...] of the vector the higher mixing.

One frameshift mutation in the opened vector and [...] fragment on
each side of the opening site gave [...] colonies depending on the
[...]. [...] can be considered to be mosaics, not only related to
the opening site.

Two frameshift mutations in the opened vector on each side [...]
opening site and one in the fragment [...] of active colonies
depending [...] and combination [...] are mosaics [...].

Two frameshift mutations in the [...] on each side of the opening
site and a wild-type fragment gave [...] colonies depending [...].

It was also found [...] vectors relative to fragments and [...] of
the fragments are also influencing the result.

Using [...] of the S. cerevisiae [...] recombination host cell
showed that [...] transformed very well with wild-type plasmid(s)
and expressed the Humicola lanuginosa lipase gene but gave no
transformants at all with opened vectors and fragments.

The RAD52 function is required for "classical recombination" (but
not for unequal sister-strand [...] recombination), showing that the
recombination of opened vector and [...] fragment could involve a
classical recombination mechanism.

Classical recombination is the recombination mechanism involved in
the recombination between genes located on nonsister [...] as
defined by [...] and Symington LS (1991), "Recombination in Yeast",
page 407-522, in The Molecular and Cellular Biology of the Yeast
Saccharomyces, Volume 1 (eds. Broach JR, Pringle JR and Jones EW),
Cold Spring Harbor Laboratory Press, New York.

WO 97/07205

PCT/DK96/00343

Multiple partially overlapping fragments

The inventor also tested recombination of multiple partial 
overlapping fragments using the method of the invention. 

The recombination of 2 and 3 partial overlapping fragments into a
gapped (i.e. that the opening results in cutting out of a little
part of the gene) vector was tested and gave a high recovery of
recombined Humicola lanuginosa lipase gene. The recovery of active
lipase gene from different combinations of inactivated Humicola
lanuginosa genes was tested for the recombination of 2 partial
overlapping fragments. The tendency was a higher mixing in the
overlapping region between the 2 fragments in the gapped region
than in the vector and fragment overlap.

When recombining many fragments from the same region, the multiple
overlapping fragment technique will increase the mixing by itself,
but it is also important to have a relatively high random mixing in
overlapping regions in order to mix closely located
variants/differences.

An overlap as small as 10 bp between two fragments was found to be
sufficient to obtain a very efficient recombination. Therefore,
overlapping in the range from 5 to 5000 bp, preferably from 10 bp to
500 bp, especially 10 bp to 100 bp is suitable according to the
method of the invention.
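
Whether two fragment ends share a sufficiently long overlap can be
checked with a short Python sketch. It assumes exact sequence
identity within the overlap, which is a simplification introduced
here and not a requirement stated in the source; the function names
and example sequences are hypothetical.

```python
def end_overlap(upstream: str, downstream: str) -> int:
    """Length of the longest suffix of `upstream` that is also a
    prefix of `downstream` (exact matches only)."""
    limit = min(len(upstream), len(downstream))
    for length in range(limit, 0, -1):
        if upstream[-length:] == downstream[:length]:
            return length
    return 0


def overlap_is_sufficient(upstream: str, downstream: str,
                          minimum: int = 10) -> bool:
    """True if the shared end is at least `minimum` bp (10 bp by default,
    the smallest overlap reported as sufficient in the text)."""
    return end_overlap(upstream, downstream) >= minimum


if __name__ == "__main__":
    first = "AAAACCCCGGGGTTTT"
    second = "CCGGGGTTTTACGT"
    print(end_overlap(first, second), overlap_is_sufficient(first, second))
```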

According to this embodiment of the present invention 2 or more
overlapping fragments, preferably 2 to 6 overlapping fragments,
especially 2 to 4 overlapping fragments may advantageously be used
as input fragments in a shuffling cycle.

Besides increasing the mixing of genes, this is a very useful
method for domain shuffling by creating small overlaps between DNA
fragments from different domains and screening for the best
combination.

For instance, in the case of three DNA fragments the overlapping 
regions may be as follows: 
- the first end of the first fragment overlaps the first end of the
opened plasmid,

- the first end of the second fragment overlaps the second end of 
the first fragment, and the second end of the second fragment 
overlaps the first end of the third fragment, 

- the first end of the third fragment overlaps (as stated above) the 
second end of the second fragment, and the second end of the third 
fragment overlaps the second end of the opened plasmid. 

It is to be understood that when using two or more DNA fragments as
starting material it is preferred to have continuous overlaps between
the ends of the plasmid and the DNA fragments.
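
The requirement of continuous overlaps between the plasmid ends and
the fragments can be expressed as a simple coordinate check. The
Python sketch below is an illustration under assumptions that are not
from the source: every fragment and the gap in the vector are
described by (start, end) positions on a common gene coordinate
system, the end position is exclusive, and the default minimum
overlap of 10 bp is borrowed from the preceding paragraph. The
function name tiling_problems is hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # (start, end) in gene coordinates, end exclusive


def tiling_problems(gap: Interval, fragments: List[Interval],
                    min_overlap: int = 10) -> List[str]:
    """List the places where the chain plasmid end, fragments, plasmid
    end is not joined by at least `min_overlap` bp."""
    if not fragments:
        return ["no fragments supplied"]
    gap_start, gap_end = gap
    ordered = sorted(fragments)
    problems = []
    # The first fragment must reach into vector sequence upstream of the gap.
    if gap_start - ordered[0][0] < min_overlap:
        problems.append("first fragment does not sufficiently overlap "
                        "the first end of the opened plasmid")
    # Neighbouring fragments must overlap each other.
    for (s1, e1), (s2, e2) in zip(ordered, ordered[1:]):
        if e1 - s2 < min_overlap:
            problems.append(f"fragments ({s1}, {e1}) and ({s2}, {e2}) "
                            f"overlap by only {max(e1 - s2, 0)} bp")
    # The last fragment must reach into vector sequence downstream of the gap.
    if ordered[-1][1] - gap_end < min_overlap:
        problems.append("last fragment does not sufficiently overlap "
                        "the second end of the opened plasmid")
    return problems


if __name__ == "__main__":
    gap = (400, 500)  # hypothetical gapped region of the vector
    print(tiling_problems(gap, [(300, 430), (420, 480), (470, 600)]))
    print(tiling_problems(gap, [(300, 450), (445, 600)]))
```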

Even though it is preferred to shuffle homologous DNA sequences in
the form of DNA fragment(s) and opened plasmid(s), it is also
contemplated according to the invention to shuffle two or more
opened plasmids comprising homologous DNA sequences encoding
polypeptides. However, in such case it is compulsory to open the
plasmids at different sites.

In a further embodiment of the invention two or more opened
plasmids and one or more homologous DNA fragments are used as the
starting material to be shuffled. The ratio between the opened
plasmid(s) and homologous DNA fragment(s) preferably lies in the
range from 20:1 to 1:50, preferably from 2:1 to 1:10 (mol vector:mol
fragments) with the specific concentrations being from 1 pM to 10 M
of the DNA.
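
The molar vector:fragment ratio can be estimated from DNA masses and
lengths, as in the following Python sketch. Two points are
assumptions introduced here: the average mass of 650 g/mol per base
pair of double-stranded DNA is a common approximation and not a
figure from the source, and the 6000 bp vector size and the DNA
amounts in the example are hypothetical (only the 0.9 kb fragment
size is mentioned in the text).

```python
AVG_BP_MASS = 650.0  # g/mol per base pair of dsDNA (common approximation)


def picomoles(nanograms: float, length_bp: int) -> float:
    """Amount of a double-stranded DNA molecule in pmol."""
    if length_bp <= 0:
        raise ValueError("length_bp must be positive")
    return nanograms * 1000.0 / (length_bp * AVG_BP_MASS)


def vector_to_fragment_ratio(vector_ng: float, vector_bp: int,
                             fragment_ng: float, fragment_bp: int) -> float:
    """Molar ratio mol vector : mol fragment, expressed as one number."""
    return picomoles(vector_ng, vector_bp) / picomoles(fragment_ng, fragment_bp)


def ratio_class(ratio: float) -> str:
    """Compare a vector:fragment ratio with the ranges given in the text."""
    if 1 / 10 <= ratio <= 2:
        return "preferred"          # 2:1 to 1:10
    if 1 / 50 <= ratio <= 20:
        return "within range"       # 20:1 to 1:50
    return "outside range"


if __name__ == "__main__":
    ratio = vector_to_fragment_ratio(100, 6000, 100, 900)
    print(round(ratio, 3), ratio_class(ratio))
```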

The opened plasmids may advantageously be gapped in such a way that
the overlap between the fragments is deleted in the vector (in order
to select for the recombination).
Preparing the DNA fragment

The DNA fragment to be shuffled with the homologous polypeptide
comprised in an opened plasmid may be prepared by any suitable
method. For instance, the DNA fragment may be prepared by PCR
amplification (polymerase chain reaction), as described above, of a
plasmid or vector comprising the gene of the polypeptide, using
specific primers, for instance as described in US 4,683,202 or Saiki
et al., (1988), Science 239, 487 - 491. The DNA fragment may also be
cut out from a vector or plasmid comprising the desired DNA sequence
by digestion with restriction enzymes, followed by isolation using
e.g. electrophoresis.

The DNA fragment encoding the homologous polypeptide in question may
alternatively be prepared synthetically by established standard
methods, e.g. the phosphoamidite method described by Beaucage and
Caruthers, (1981), Tetrahedron Letters 22, 1859 - 1869, or the
method described by Matthes et al., (1984), EMBO Journal 3, 801 -
805. According to the phosphoamidite method, oligonucleotides are
synthesized, e.g. in an automatic DNA synthesizer, purified,
annealed, ligated and cloned in suitable vectors.

Furthermore, the DNA fragment may be of mixed synthetic and genomic,
mixed synthetic and cDNA or mixed genomic and cDNA origin prepared
by ligating fragments of synthetic, genomic or cDNA origin (as
appropriate), the fragments corresponding to various parts of the
entire DNA sequence, in accordance with standard techniques.

The plasmid 

The plasmid comprising the DNA sequence encoding the polypeptide in
question may be prepared by ligating said DNA sequence into a
suitable vector or plasmid, or by any other suitable method.

Said vector may be any vector which may conveniently be subjected to
recombinant DNA procedures. The choice of vector will often depend
on the recombination host cell into which it is to be introduced.

Thus, the vector may be an autonomously replicating vector, i.e. a
vector which exists as an extrachromosomal entity, the replication
of which is independent of chromosomal replication, e.g. a plasmid.
Alternatively, the vector may be one which, when introduced into the
recombination host cell, is integrated into the host cell genome and
replicated together with the chromosome(s) into which it has been
integrated.

To facilitate the screening process it is preferred that the vector
is an expression vector in which the DNA sequence encoding the
polypeptide in question is operably linked to additional segments
required for transcription of the DNA. In general, the expression
vector is derived from a plasmid, a cosmid or a bacteriophage, or
may contain elements of any or all of these.

The term "operably linked" indicates that the segments are arranged
so that they function in concert for their intended purposes, e.g.
transcription initiates in a promoter and proceeds through the DNA
sequence coding for the polypeptide in question.

The promoter may be any DNA sequence which shows transcriptional
activity in the recombination host cell of choice and may be derived
from genes encoding proteins either homologous or heterologous to
the host cell.

Examples of suitable promoters for use in yeast host cells include
promoters from yeast glycolytic genes (Hitzeman et al., (1980), J.
Biol. Chem. 255, 12073 - 12080; Alber and Kawasaki, (1982), J. Mol.
Appl. Gen. 1, 419 - 434) or alcohol dehydrogenase genes (Young et
al., in Genetic Engineering of Microorganisms for Chemicals
(Hollaender et al, eds.), Plenum Press, New York, 1982), or the TPI1
(US 4,599,311) or ADH2-4c (Russell et al., (1983), Nature 304, 652 -
654) promoters.



Examples of suitable promoters for use in filamentous fungus host
cells are, for instance, the ADH3 promoter (McKnight et al., (1985),
The EMBO J. 4, 2093 - 2099) or the tpiA promoter. Examples of other
useful promoters are those derived from the gene encoding A. oryzae
TAKA amylase, Rhizomucor miehei aspartic proteinase, A. niger
neutral α-amylase, A. niger acid stable α-amylase, A. niger or A.
awamori glucoamylase (gluA), Rhizomucor miehei lipase, A. oryzae
alkaline protease, A. oryzae triose phosphate isomerase or A.
nidulans acetamidase. Preferred are the TAKA-amylase and gluA
promoters.

The DNA sequence encoding the polypeptide in question may
also, if necessary, be operably connected to a suitable terminator,
such as the human growth hormone terminator (Palmiter et al., op.
cit.) or (for fungal hosts) the TPI1 (Alber and Kawasaki, op. cit.)
or ADH3 (McKnight et al., op. cit.) terminators. The vector may
further comprise elements such as polyadenylation signals (e.g. from
SV40 or the adenovirus 5 Elb region), transcriptional enhancer
sequences (e.g. the SV40 enhancer) and translational enhancer
sequences (e.g. the ones encoding adenovirus VA RNAs).

The vector may further comprise a DNA sequence enabling the vector
to replicate in the recombination host cell in question.
When the host cell is a yeast cell, suitable sequences enabling the
vector to replicate are the yeast plasmid 2µ replication genes REP
1-3 and origin of replication.

The plasmid [name illegible in the source] can be used for production
of useful proteins and peptides, using filamentous fungi, such as
Aspergillus sp., and yeasts as recombinant host cells (JP 2 52 4
5777-A).

The vector may also comprise a selectable marker, e.g. a gene the
product of which complements a defect in the recombination host
cell, such as the gene coding for dihydrofolate reductase (DHFR) or
the Schizosaccharomyces pombe TPI gene (described by P.R. Russell,
(1985), Gene 40, 125-130).

Other examples of such suitable selective markers are the ura3 and
leu2 genes, which complement the corresponding defective genes of
e.g. the yeast strain Saccharomyces cerevisiae YNG318.

The vector may also comprise a selectable marker which confers
resistance to a drug, e.g. ampicillin, kanamycin, tetracyclin,
chloramphenicol, neomycin, hygromycin or methotrexate. For
filamentous fungi, selectable markers include amdS, pyrG, argB, niaD,
sC, trpC, pyr4, and DHFR.

To direct the polypeptide in question into the secretory pathway of
the recombination host cell, a secretory signal sequence (also known
as a leader sequence, prepro sequence or pre sequence) may be
provided in the recombinant vector. The secretory signal sequence is
joined to the DNA sequence encoding the lipolytic enzyme in the
correct reading frame. Secretory signal sequences are positioned 5'
to the DNA sequence encoding the polypeptide. The secretory signal
sequence may be the signal normally associated with the polypeptide
in question or may be from a gene encoding another secreted protein.

The signal peptide may be a naturally occurring signal peptide or a
functional part thereof, or it may be a synthetic peptide. For
secretion from yeast cells, suitable signal peptides include the
α-factor signal peptide, the signal peptide of mouse salivary
amylase, a modified carboxypeptidase signal peptide (cf. Valls et
al., (1987), Cell) and a lipase signal peptide. The remaining names
and citations in this passage are illegible in the source, apart from
a final reference to Yeast, 127-137.

[Passage partly illegible in the source.] The legible fragments refer
to transport to the Golgi apparatus and further to a secretory
vesicle for secretion, through the cellular membrane into the
periplasmic space of the yeast cell, and to a leader peptide
described in e.g. US [number illegible], EP 16 201, EP 123 294, EP
123 544 and EP 163 529. Alternatively, the leader peptide may be a
synthetic leader peptide.

For use in filamentous fungi, the signal peptide may conveniently be
derived from a gene encoding an Aspergillus sp. amylase or
glucoamylase, or a gene encoding a Rhizomucor miehei enzyme [name
illegible]. The signal peptide is preferably derived from a gene
encoding A. oryzae TAKA amylase, A. niger neutral α-amylase, A. niger
acid-stable amylase, or A. niger glucoamylase.

The recombination host cell 

The recombination host cell, into which the mixture of
plasmid/fragment DNA sequences is to be introduced, may be any
eukaryotic cell, including fungal cells and plant cells, capable of
recombining the homologous DNA sequences in question.

According to the prior art, prokaryotic microorganisms, such as
bacteria including Bacillus and E. coli; eukaryotic organisms, such
as filamentous fungi, including Aspergillus, and yeasts such as
Saccharomyces cerevisiae; and tissue culture cells from avian or
mammalian origins have been suggested for in vivo recombination. All
of said organisms can be used as recombination host cell, but in
general prokaryotic cells are not sufficiently effective (i.e. do
not result in a sufficient number of variants) to be suitable for
recombination methods for industrial use.

Consequently, preferred recombination host cells according to the
present invention are fungal cells, such as yeast cells or
filamentous fungi.

Examples of suitable yeast cells include cells of Saccharomyces sp.,
in particular strains of Saccharomyces cerevisiae or Saccharomyces
kluyveri, or Schizosaccharomyces sp. Methods for transforming yeast
cells with heterologous DNA and producing heterologous polypeptides
therefrom are described, e.g. in US 4,599,311, US 4,931,373, US
4,870,006, 5,037,743, and US 4,345,075, all of which are hereby
incorporated by reference. Transformed cells may be selected by,
e.g., a phenotype determined by a selectable marker, commonly drug
resistance or the ability to grow in the absence of a particular
nutrient, e.g. leucine. A preferred vector for use in yeast is the
POT1 vector disclosed in US 4,931,372. The DNA sequence encoding the
polypeptide may be preceded by a signal sequence and optionally a
leader sequence, e.g. as described above. Further examples of
suitable yeast cells are strains of Kluyveromyces, such as K.
lactis, Hansenula, e.g. H. polymorpha, or Pichia, e.g. P. pastoris
(cf. Gleeson et al., (1986), J. Gen. Microbiol. 132, 3459-3465; US
4,862,279).

Examples of other fungal cells are cells of filamentous fungi, e.g.
Aspergillus sp., Neurospora sp. or Fusarium sp., in particular
strains of A. oryzae or A. nidulans (the list is partly illegible in
the source). The use of Aspergillus sp. for the expression of
proteins is described in, e.g., EP 272 277, EP 230 023. A further
procedure, illegible in the source, is described by Malardier et al.,
(1989), Gene 78, 147-156.

In a preferred embodiment the recombination host cell is a cell of
the genus Saccharomyces, in particular S. cerevisiae.



METHODS AND MATERIALS 



- - c a s e e r. " r 



ON A sscusnc 



2 0 v* 




f=) D57G,NS4K 
(c) £37K,G9iA 

' s ) 

ff) Z210K 



D551, 
09 5.-;, 



K237 i v,Z252L..?255T,G263A,L264Q 
-^:7c.,G9iA,-95L,D96?,{<93:, {X2 37M) 






Variants used for preparing DNA fragments by standard PCR:

(g) S93T,N94K,D95N
(h) E87K,D95V
(i) N94K,D96A
(j) E87K,G91A,D96A
(k) D167G,E210V
(l) S33T,G91A,Q249R
(m) E87K,G91A
(n) S[position illegible]T,E87K,G91A,N94K,D96N,D111K
(o) N73D,E87K,G91A,N94I,D96G
(p) L67P,I76V,S83T,E87N,I90N,G91A,D96A,K98R
(q) S83T,E87K,G91A,N92H,N94K,D96M
(s) S85P,E87K,G91A,D96L,L97V
(t) E87K,I90N,G91A,N94S,D96N,I100T
(u) [two entries illegible],F80L,S85T,D96G,R108W,G109V,D111G,S116P,
    L124S,V132M,V140Q,V141A,F142S,H145R,N162T,I166V,F181P,F183S,
    R205G,A243T,D254G,F262L
(v) E56R,D57L,I90F,D96L,E99K
(x) E56R,D57L,V60M,D62N,S83T,D96P,D102E
(y) D57G,N94K,D96L,L97M
(z) E87K,G91A,D96R,I100V,E129X,K237M,I252L,P256T,G263A,L264Q
(aa) E56R,D57G,S53F,D62C,T64R,E87G,G91A,F95L,D96P,K98I

Strains:

Expression system host:

Saccharomyces cerevisiae YNG318: MATa Dpep4[cir+] ura3-52, leu2-D2,
his 4-539

Saccharomyces cerevisiae rad52: the strain designation and genotype
are largely illegible in the source (legible: MATa, rad52); obtained
from [name illegible], Institute of Genetics, [remainder illegible]

Plasmids:

pYES 2.0 (the accompanying description is illegible in the source;
legible fragments: "figure", "transformation", "ura3", "leu2")

Media

SC-ura-: 90 ml 10 x Basal salt, 22.5 ml 20% casamino acids, 9 ml 1%
tryptophan, H2O ad 806 ml, autoclaved, 3.5 ml 5% threonine and 90 ml
20% glucose or 20% galactose added.

LB-medium: 10 g Bacto-tryptone, 5 g Bacto yeast extract, 10 g NaCl
in 1 litre water.

Brilliant Green (BG) (Merck, art. No. 1.01310)

BG-reagent: 4 mg/ml Brilliant Green (BG) dissolved in water

Substrate 1:

10 ml olive oil (Sigma CAT NO. O-1500)
20 ml 2% polyvinyl alcohol (PVA)
The Substrate is homogenised for 15-20 minutes.

Methods:

Construction of yeast expression vector

The expression plasmids pJS026 and pJS037 are derived from pYES
2.0. The inducible GAL1-promoter of pYES 2.0 was replaced with the
constitutively expressed TPI (triose phosphate isomerase)-promoter
from Saccharomyces cerevisiae (Alber and Kawasaki, (1982), J. Mol.
Appl. Genet., 1, 419-434), and the ura3 promoter has been deleted. A
restriction map of pJS026 and pJS037 is shown in figure 3 and figure
4, respectively.

Preparation of the wild-type DNA fragment

A lipase wild-type DNA fragment can be prepared either by PCR
amplification (resulting in low, medium or high mutagenesis) of the
pJS026 plasmid or by cutting the fragment out by digestion with
a suitable restriction enzyme.

Fermentation of Humicola lanuginosa lipase variants in yeast

10 ml of SC-ura- medium is inoculated with a S. cerevisiae colony
and grown at 30°C for 2 days. The 10 ml is used for inoculating 300
ml SC-ura- medium which is grown at 30°C for 3 days. The 300 ml is
used for inoculation of 5 l of the following:

400 g      Amicase
5.7 g      yeast extract
12.5 g     L-Leucin (Fluka)
6.7 g      (NH4)2SO4
10 g       MgSO4·7H2O
17 g       K2SO4
10 ml      Trace compounds
5 ml       Vitamin solution
6.7 ml     H3PO4
25 ml      20% Pluronic (antifoam)

In a total volume of 5000 ml.

The yeast cells are fermented for 5 days at 30°C. They are given a
start dosage of 100 ml 70% glucose and added 400 ml 70% glucose/day.
A pH of 5.0 is kept by addition of a 10% NH3 solution. Agitation is
300 rpm for the first 22 hours followed by 900 rpm for the rest of
the fermentation. Air is given with 1 l air/l/min for the first 22
hours followed by 1.5 l air/l/min for the rest of the fermentation.



Trace compounds:

6.8 g      ZnCl2
54.0 g     FeCl2·6H2O
19.1 g     MnCl2·4H2O
2.2 g      CuSO4·5H2O
2.58 g     CoCl2
0.62 g     H3BO3
0.024 g    (NH4)6Mo7O24·4H2O
0.2 g      KI
100 ml     HCl (concentrated)

In a total volume of 1 l.



Vitamin solution:

250 mg     Biotin
10 g       D-Calcium pantothenate
100 g      Myo-Inositol
1.5 g      Pyridoxin
1.2 g      Niacinamide
0.4 g      Folic acid
0.4 g      Riboflavin

In a total volume of 1 l.



Transformation of yeast

Saccharomyces cerevisiae is transformed by standard methods (cf.
Sambrook et al., (1989), Molecular Cloning: A Laboratory Manual,
2nd Ed., Cold Spring Harbor).

Determination of yeast transformation frequency

The transformation frequency is determined by cultivating the
transformants on SC-ura- plates for 2 days and counting the number of
colonies appearing. The number of transformants per µg opened
plasmid is the transformation frequency.
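
The definition above amounts to dividing a colony count by the amount of opened plasmid used. A minimal sketch, with a hypothetical function name and hypothetical example numbers that are not taken from the experiments; the mass unit is left to the caller, so the result is "transformants per unit mass".

```python
def transformation_frequency(colony_count: int, plasmid_mass: float) -> float:
    """Transformants per unit mass of opened plasmid.

    colony_count: colonies counted on the selective plates.
    plasmid_mass: amount of opened plasmid used, in the unit of choice.
    """
    if colony_count < 0:
        raise ValueError("colony_count must be >= 0")
    if plasmid_mass <= 0:
        raise ValueError("plasmid_mass must be > 0")
    return colony_count / plasmid_mass


if __name__ == "__main__":
    # Hypothetical comparison: opened plasmid alone versus opened plasmid
    # co-transformed with a homologous fragment, same plasmid amount.
    alone = transformation_frequency(2, 0.5)
    with_fragment = transformation_frequency(400, 0.5)
    print("plasmid alone:", alone)
    print("plasmid + fragment:", with_fragment)
    print("fold increase:", with_fragment / alone)
```

Comparing the frequency with and without fragment, as in the usage lines, is the kind of contrast the method uses to show that circularisation of the opened plasmid depends on recombination with the fragment.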

Screening for positive variants with improved wash performance

The following filter assay can be used for screening positive
variants with improved wash performance.

Low calcium filter assay

1) Provide SC Ura- replica plates (useful for selecting strains
carrying the expression vector) with a first protein binding filter
(Nylon membrane) and a second low protein binding filter (Cellulose
acetate) on the top.

2) Spread yeast cells containing a parent lipase gene or a mutated
lipase gene on the double filter and incubate for [number of days
illegible in the source] days at 30°C.

3) Keep the colonies on the top filter by transferring the top
filter to a new plate.

4) Remove the protein binding filter to an empty petri dish.

5) Pour an agarose solution comprising an olive oil emulsion (2%
PVA:olive oil=3:1), Brilliant green (indicator, 0.004%), 100 mM tris
buffer [pH and one further component illegible in the source] (final
concentration 5 mM) on the bottom filter so as to identify colonies
expressing lipase activity in the form of blue-green spots.

6) Identify colonies having a reduced dependency for calcium as
compared to the parent lipase.

DNA sequencing was performed using an Applied Biosystems ABI DNA
sequence model [number illegible], according to the protocol in the
ABI Dye Terminator Cycle [remainder illegible].



Assessing the efficiency of recombination

[Opening words illegible in the source] efficiency of the opened
vector and fragment recombination. The percentage of colonies with
lipase activity gives an estimate of the mixing of the active and
inactive genes. Theoretically, assuming equal likelihood of wild type
and frameshift at each site, the mixing is better the closer the
percentage is to 50% for one frameshift, 25% for 2 frameshifts and
12.5% for 3 frameshifts.
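
The theoretical values quoted above follow from treating each frameshift site as an independent choice between the wild-type and the frameshifted sequence with equal likelihood, so a gene is active only if it inherits the wild-type sequence at every site. A minimal sketch of that arithmetic; the function name is hypothetical and the independence of the sites is the stated assumption.

```python
def expected_active_fraction(n_frameshifts: int) -> float:
    """Expected fraction of active genes under complete mixing.

    Assumes each of the n frameshift sites independently ends up as
    wild type with probability 0.5.
    """
    if n_frameshifts < 0:
        raise ValueError("n_frameshifts must be >= 0")
    return 0.5 ** n_frameshifts


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, "frameshift(s):", 100 * expected_active_fraction(n), "%")
```

Observed percentages well below these values indicate incomplete mixing.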

Frameshift mutation

The frameshift mutations were created either by filling in a
restriction site (in case of 5' overhang) or deleting the "sticky
ends" (in case of 3' overhang) by DNA polymerase with or without
dNTP (deoxynucleotides = equal amounts of dATP, dTTP, dCTP and
dGTP). Methods for filling in of restriction sites (referred to as
"J" on Figure 7) and deleting the sticky ends (referred to as
"(D)" on Figure 7) are well known in the art.

Method for assessing colonies with lipase activity

The number of colonies and positives (i.e. with lipase activity) are
calculated as the average of 3 plates.
The cultivation and screening conditions used are the
following:

1) Provide SC Ura- plates with a protein binding filter (Nylon
filter) onto the plate.

2) Spread yeast cells containing a parent lipase gene or a mutated
lipase gene on the filter and incubate for 3 or 4 days at 30°C.

3) Remove the protein binding filter with the colonies to a petri
dish containing: An agarose solution comprising an olive oil
emulsion (2% PVA:olive oil=2:1), Brilliant green (indicator, 0.004%),
100 mM tris buffer pH 9.

5) Identify colonies expressing lipase activity in the form of
blue-green spots.
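
The averaging over plates described in this method can be sketched as follows. This is an illustration only: the function name and the plate counts in the usage example are hypothetical, and the percentage is computed from the averaged counts, which is one reasonable reading of "calculated as the average of 3 plates".

```python
from typing import List, Tuple


def summarize_plates(plates: List[Tuple[int, int]]) -> Tuple[float, float, float]:
    """Average colony counts over replicate plates.

    plates: one (total_colonies, positive_colonies) pair per plate.
    Returns (mean total, mean positives, percent positives).
    """
    if not plates:
        raise ValueError("at least one plate is required")
    n = len(plates)
    mean_total = sum(total for total, _ in plates) / n
    mean_positive = sum(positive for _, positive in plates) / n
    percent = 100.0 * mean_positive / mean_total if mean_total else 0.0
    return mean_total, mean_positive, percent


if __name__ == "__main__":
    # Hypothetical counts from three replicate plates.
    print(summarize_plates([(100, 20), (120, 18), (80, 22)]))
```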

EXAMPLES 

Example 1

Testing in vivo recombination of two homologous genes

The Saccharomyces cerevisiae expression plasmid pJS026 was
constructed as described above in the "Material and Methods"-
section.

A synthetic Humicola lanuginosa lipase gene (in pJS037) containing
12 additional restriction sites (see figure 4) was cut with NruI,
PstI, and NruI and PstI, respectively, to open the gene
approximately in the middle of the DNA sequence encoding the lipase.

The opened plasmid (pJS037) was transformed into Saccharomyces
cerevisiae YNG318 together with an about 0.9 kb wild-type Humicola
lanuginosa lipase DNA fragment (see figure 1) prepared from pJS026
by PCR amplification.

Further, the opened plasmid was also transformed into the yeast
recombination host cell alone (i.e. without the 0.9 kb lipase DNA
fragment).



The transformed yeast cells were grown as described in the
"Materials and Methods"-section above, and the transformation
frequency was determined as described above.

It was found that the transformation frequency of the opened plasmid
alone was very low (10 transformants per µg opened plasmid) in
comparison to the transformation frequency of plasmid plus fragment
(the figure is partly illegible in the source and reads "1 50, 000"
transformants per µg opened plasmid).

The plasmid/fragment region of 20 transformants was PCR amplified,
giving products containing fragments covering the recombined
plasmid/fragments (this sentence is partly illegible in the source).
The recombination in the transformants was analyzed by restriction
analysis using standard methods. The result is displayed in Table 1.



Table 1

[The body of Table 1 is illegible in the source. The legible row
labels are P2, P3, N1 to N6, P/N1, P/N6 and P/N7, with entries "sg",
"sc" and "nd".]

P: plasmid opened with PstI
N: plasmid opened with NruI
P/N: plasmid opened with PstI and NruI (resulting in the removal of
a 75 bp fragment)

As can be seen from Table 1, 10 transformants (equivalent to 50%)
contained recombined DNA sequences. 4 of these 10 DNA sequences
(equivalent to 20%) contained either a region of the wild-type gene
recombined into the synthetic gene or a region of the synthetic gene
recombined into the wild-type fragment.



Example 2 

In vivo recombination of Humicola lanuginosa lipase variants

The DNA sequences of 20 variants of the Humicola lanuginosa lipase
were in vivo recombined in the same mixture.

Six vectors were prepared from the lipase variants (a) to (f) (see
the list above) by ligation into the yeast expression vector pJS037.
All vectors were cut open with NruI.

DNA fragments of all 20 homologous DNA sequences (g) to (aa) (see the
list above) were prepared by PCR amplification using standard
methods.

The 20 DNA fragments and the 6 opened vectors were mixed and
transformed into the yeast Saccharomyces cerevisiae YNG318 by
standard methods. The recombination host cell was cultivated as
described above and screened as described above. About 20
transformants were isolated and tested for improved wash performance
using the filter assay method described in the "Material and
Methods"-section.



Two positive transformants (named A and B) were identified using the
filter assay.

In comparison to the wild-type amino acid sequence the two
recombined positive transformants had the following mutations.

A: D57G, N94K, D96L, P256T

A is a recombination of two variants.
[marker lost in the source] originates from the vector (d)
===== originates from the DNA fragment prepared from variant (y)

B: D57G, G59V, N94K, D96L, L97M, S116P, S170P, N249R
???? <<<<< ????? =====

B is a recombination of vector (c) and DNA fragments (n) and (u).
[marker lost in the source] originates from the vector (c)
<<<< originates from the DNA fragment prepared from variant (u)
===== originates from the DNA fragment prepared from variant (n)
???? Amino acid mutation which is not a result of recombination.

As can be seen, the resulting positive variants have been formed by
recombination of two or more variants. The amino acid mutations
marked "?????" are not a result of in vivo recombination, as none of
the shuffled lipase variants (see the list above) comprise any of
said mutations. Consequently, these mutations are a result of random
mutagenesis arising during preparation of the DNA fragments by
standard PCR amplification.



Example 3

Recombination with one frameshift mutation

A synthetic Humicola lanuginosa lipase gene (in vector pJS037) was
made inactive at various positions by deleting (positions 134/335)
or filling in (positions 290/317/51[illegible]/[illegible]45)
restriction enzyme sites or by site-directed introduction of a stop
codon. All inactive synthetic lipase genes of 911 bp can be deduced
from Figure 7.

A number of different 911 bp DNA fragments were made from the above
vectors using primer 4[illegible]99 and primer 5164 using standard
PCR technique. Smaller PCR fragments were made using primer 3437 and
primer 4543 (260 bp), and primer 2343 and primer 4543 (438 bp).

0.5 µl (app. 0.1 µg) of vectors Blue 425, Blue 426, Blue 428 and
Blue 429, opened with PstI (i.e. position 335), and vectors Blue 424
and Blue 425, opened with NruI (i.e. position 464), were together
with 3 µl (app. 0.5 µg) of fragments 424, 425, 426, 428 and 429 in
various combinations transformed into 100 µl Saccharomyces cerevisiae
YNG318 competent cells as displayed in Table 1A.

The number of colonies and positives (i.e. with lipase activity)
were calculated as the average of 3 plates as described in the
Material and Methods section.

The result of the test is shown in Table 1A.

Table 1A

Vector + Fragment            Number of     % of colonies with
                             colonies      active lipase activity
1. Blue 428 + 429°           114           16%
2. Blue 429 + 428#           645           3%
3. Blue 426 + 425#           276           25%
4. Blue 425 + 426            528           18%
5. Blue 425/NruI + 426       539           28%
6. Blue 425 + 424            139           7%
7. Blue 424/NruI + 425#      74            32%
8. Blue 428 + 425            121           [illegible]
9. Blue 428 + wt fragment    [illegible]   37%

Pairwise recombinations of one frameshift mutation on the vector
and another on the fragment on the opposite side of the opening
site. ° determined by 9 plates; # determined by 5 plates.

The first 2 rows of Table 1A display vectors and fragments with a
frameshift on each side of the PstI site. The "mirror image"
experiment in row 2 compared to row 1 gives a reproducibly lower
number of active colonies. The same is true for rows 3 and 4 even
though it is not as pronounced. Moving the opening site closer to
the frameshift in the vector increases the number of actives as
seen in row 5. This can explain the reason for the difference in
the "mirror image" experiments. In both cases the higher number of
positives has the opening site closer to the frameshift in the
vector.

It can therefore be concluded that the closer the mutation is to
the end of the vector the higher the chance of mixing. This probably
arises from the well known fact that free DNA ends have a high
recombinogenic potential. Therefore it is desirable to have as many
free DNA ends as possible to increase the mixing of the genes. This
is for example obtained in the later example with recombination of
multiple overlapping fragments.

Row 6 has a rather low number of actives, probably due to the
location of the frameshift on the fragment exactly at the PstI
opening site of the vector.
Row 7 has the frameshift of the vector close to the opening site
and again it gives a high number of actives.

Recombination with one stop codon mutation

In order to test if there is any difference in the recombination
efficiency of stop codon mutations compared to frameshift mutations
the following experiments were made.

In the same way as described above, 0.5 µl (app. 0.1 µg) of vectors
Blue 624, Blue 625 and Blue 626 (see Table 1B) opened with PstI,
together with 3 µl of fragments made by site-directed mutagenesis
(designations partly illegible in the source; 625 and 626 are
legible), were transformed into 100 µl competent cells.

Table 1B

[The body of Table 1B is largely illegible in the source. The legible
percentages of colonies with active lipase are 40%, 12% and 15%.]

Rows 1 and 2 in Table 1B have the mutations located at the same
place as rows 1 and 2 [reference illegible in the source]. The
following sentences are largely illegible; the legible fragments
compare the number of active colonies obtained with stop codon
mutations to that obtained with frameshift mutations and refer to a
better mixing than with frameshift mutations. Rows 3 and 4 confirm
that the closer the mutation is to the end of the vector the higher
the chance of mixing.

Recombination with one or two frameshift mutations in the vector
and one or two frameshift mutations in the fragment

Using the same approach as described above the influence of one or
two frameshift mutations in the vector and one or two frameshift
mutations in the fragment was tested using vectors Blue 425, 426
and 428 (one mutation) and vectors Blue 442 and Blue 443 (two
mutations), and fragments 442 and 443 (two frameshift mutations),
fragments 424, 425, 426, 427 and 428 (one mutation) and wild-type
(no mutation).

The vectors Blue 442 and 443 are double frameshift mutations: Blue
442 = 428 + 429 and Blue 443 = 427 + 429 (see Figure 7).

The result of the test is shown in Table 2A and Table 2B.



Table 2A

Vector + Fragment       Number of     % of colonies with
                        colonies      active Lipolase
1. Blue 425 + 442       142           15%
2. Blue 425 + 443       144           [illegible]
3. Blue 426 + 442       42            42%
4. Blue 426 + 443#      77            20%
5. Blue 428 + 443       115           3.3%

One frameshift mutation on the vector and two on the fragment on
each side of the opening site. # determined by 6 plates.



Table 2B

Vector + Fragment           Number of     % of colonies with
                            colonies      active Lipolase
Blue 442 + 424              137           0.5%
Blue 442 + 426              118           1.1%
Blue 442 + 427#             125           1.3%
Blue 443 + 425              540           2.5%
Blue 443 + 426              196           1.5%
Blue 443 + 428              469           3.1%
Blue 442 + wt fragment      135           7.7%
Blue 443 + wt fragment      488           10.7%

Two frameshift mutations on the vector on each side of the opening
site and one on the fragment. # determined by 6 plates.



Table 2A shows a rather high number of colonies with lipase
activity even with a total of 3 frameshifts (but only one
frameshift on the vector) except for the last row where the
frameshift on the vector is located far from the opening site. Row
4 has fewer actives than row 2, probably because the frameshift
on the vector is located further away from the opening site than
the frameshift on the fragment, making the active genes mosaics that
are not related to the opening site (see figure 2A). In Table 2B a
very low number of actives is observed when there are 2
frameshifts located on the vector. Most of these active colonies
are mosaics of the "parent" DNA, meaning that the mixing is not
related to the opening site (see figure 2B).

Recombination with two different vectors or fragments

The result of recombination with two different vectors or
fragments is shown in Table 3.



Table 3

Vector + Fragment                       Number of    % of colonies with
                                        colonies     active Lipolase
Blue 428/PstI + Blue 429/PstI #         13           15%
Blue 428/PstI + Blue 429/PstI + 442     273          4.2%
Blue 442/PstI + 428 + 429               228          0.8%
Blue 443/PstI + 427 + 428               229          1.6%

Recombinations with 2 different vectors or fragments. # Determined
by 1 plate.



A low number of colonies is seen for the control experiment in row
1 of table 3 as expected. The fragment added in the middle row has
two frameshifts each corresponding to the frameshift on each
vector. Via a tripartite recombination 4.2% actives are created.
With two fragments each having one frameshift and a vector with the
same two frameshifts very few actives are found.

Recombination with vectors opened at different sites 

Opening the vector in one side instead of approximately in the 
middle still gives good recombination as shown in Table 4. Two 
vectors opened at different sites can also recombine to some extent 
(compare with the vector controls in table 13). 



Table 4

Vector + Fragment                     Number of     % of colonies with
                                      colonies      active Lipolase
Blue 428/XhoI + 42[illegible]         [illegible]   11%
Blue 428/XhoI + Blue 429/PstI #       [illegible]   5.3%

# determined by 6 plates. [Caption partly illegible in the source;
the legible words are "instead of in the middle".]


Recombination at different concentrations of vector and fragment

The relative concentration of vector and fragment does influence the
percentage of positive colonies as can be seen in Table 5.



Table 5

Vector + Fragment                     Number of     % of colonies with
                                      colonies      lipase activity
0.5 µl Blue 425 + 3 µl 442            42            42%
1.5 µl Blue 426 + 3 µl 442            21            51%
1.5 µl Blue 426 + 9 µl 442            34            26%
1.5 µl Blue 426 + 3 µl 427            230           2.8%
1 µl Blue 442 + 1 µl [illegible]      224           1.16%
1 µl Blue 442 + 2 µl 425              429           0.9%
1 µl Blue 442 + 4 µl 425              [illegible]   [illegible]
1 µl Blue 442 + 8 µl 425              434           1.6%
1 µl Blue 442 + 16 µl 425             [illegible]   [illegible]

Varying the concentration of the vector or fragment.



Recombination with fragments of different size

The size of the fragment also influences the recombination result
as seen in Table 6.



Table 6

Vector + Fragment                       Number of     % of colonies with
                                        colonies      active Lipolase
Blue 424 + [illegible] (250 bp)         73            34%
[illegible] (489 bp)                    133           45%
Blue 424 + 424 ([illegible] bp)         [illegible]   0.3%
Blue 424 + 42[illegible] (480 bp)       [illegible]   [illegible]
Blue 42[illegible] + [illegible] (480 bp)  [illegible]   [illegible]
Blue 425 + 425 (480 bp)                 [illegible]   25%

Recombination with smaller fragments than 900 bp. [Fragment
designations, sizes and several values are only partly legible in the
source.]



Recombination with unopened vectors

Transformation with unopened vectors shows a very low degree of
recombination (Table 7).



Table 7

Plasmid                 Number of colonies    % of colonies with active Lipolase
Blue 426 + Blue 429     887                   0.3%
Blue 426 + Blue 425     697                   0.7%

Recombination of unopened plasmids.






Example 4 

Test of S. cerevisiae mutants altered in recombination

Using the same approach as described in Example 3, recombination of
opened and unopened vectors and fragments was tested using a
Saccharomyces cerevisiae rad52 mutant as the recombination host
cell. The result is displayed in Table 8.

Table 8

Vector + Fragment     Number of colonies    % of colonies with active Lipolase
Blue 423 + 429        0                     0
Blue 442 + 427        0                     0
Blue 424 + 425        0                     [...]
Blue 426 + 443        0                     [...]
Plasmid pJS037        544                   100%

Recombination result in rad52 mutant.

The result with rad52 showed that recombination was completely
abolished. The RAD52 function is required for classical
recombination (but not for unequal sister-strand mitotic
recombination), showing that the recombination of opened vector and
fragment could involve a classical recombination mechanism.




Example 6 

Recombination of multiple partial overlapping fragments

In order to increase the mixing of the mutations by the
recombination method of the invention, recombination of two
fragments and one gapped vector was attempted.

Table 15

Vector + Fragment                              Number of colonies    % of colonies with lipase activity
1. pJS037/HindIII-XhoI + PCR319+PCR327         > 2000                100%
2. pJS037/HindIII-XhoI + PCR321+PCR331         ≈ 2000                ≈ 0.2%

WO 97/07205                                        PCT/DK96/00343

3. pJS037/HindIII-XhoI + PCR319+PCR331         ≈ 1500                ≈ 1%
4. pJS037/HindIII-XhoI + PCR319+PCR386         > 5000                > 90%
5. pJS037/HindIII-XhoI + PCR321+PCR386         > 5000                ≈ 25%
6. Blue 428/HindIII-XhoI + PCR321+PCR331       400                   0.2%
7. Blue 428/HindIII-XhoI + PCR319+PCR327       ≈ 1500                > 90%
8. Blue 428/HindIII-XhoI + PCR321+PCR327       ≈ 150                 ≈ 10%
9. Blue 423/HindIII-XhoI + PCR327+PCR335       ≈ 1500                ≈ 10%
10. Blue 429/HindIII-XhoI + PCR319+PCR3[...]5  ≈ 400                 ≈ 15%
11. Blue 425/HindIII-XhoI + PCR321+PCR3[...]6  ≈ 250                 ≈ 15%
12. Blue 442/HindIII-XhoI + [...]              ≈ 1500                ≈ 10%
13. [...]/HindIII-XhoI                         [...]                 0
14. Blue 429/HindIII-XhoI                      [...]                 0
15. Blue 442/HindIII-XhoI + [...]              6                     0
16. Blue 423/HindIII-XhoI + PCR331             4                     0
17. Blue 423/HindIII-XhoI + PCR321             2                     0

Recombination result of two fragments and a gapped vector. The last
5 rows are controls.

As can be seen in Table 15, the recovery of the Humicola lanuginosa
lipase gene is very efficient. The last 5 rows in Table 15 show
that the opened vector alone, or with only one fragment not covering
the whole gap (see Figure 3), gives only very few colonies.

The first row, with wild type fragments, gives 100% active
colonies.

The second row is with two fragments each containing a frameshift.
The fragment PCR331 has the frameshift located at the BglII site
which, in this recombination, is not covered by a wild type fragment
(see Figure 3) and therefore gives about 0% active lipase. The same
is the case for rows 3 and 6.

In row 4, fragment PCR386 contains a frameshift at the SphI site,
which is overlapped by wild type sequences in the gapped vector. The
frameshift was recombined into less than 10% of the genes, which is
lower than the result for one-fragment recombination in the last row
of Table 1A above.

In row 5 a rather high mixing is observed between the 2 fragments,
each containing a frameshift, and the wild type gapped vector, giving
25% active and 75% inactive lipase colonies. This is probably because
the fragment PCR321 has the frameshift in the overlap between the 2
fragments and in the gapped region of the vector. If fragment PCR386
contributes 10% inactives as in row 4, fragment PCR321 gives the
remaining 65% inactives; therefore PCR386 gives 35% wild type in the
overlap.
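
The row 5 estimate can be retraced as simple bookkeeping. The sketch below is illustrative only: the variable names are hypothetical, and the figures (25% active colonies, about 10% inactives attributed to PCR386 by analogy with row 4, the remainder attributed to PCR321) are readings of this partly damaged passage and are treated here as assumptions.

```python
# Bookkeeping for Table 15, row 5. All values are assumptions read from the text.
active_pct = 25                       # colonies with lipase activity
inactive_pct = 100 - active_pct       # colonies without lipase activity

pcr386_inactive_pct = 10              # share attributed to PCR386, as in row 4
pcr321_inactive_pct = inactive_pct - pcr386_inactive_pct

# Share of recombinants that kept wild type sequence in the overlap region,
# i.e. that did not pick up the PCR321 frameshift.
wild_type_in_overlap_pct = 100 - pcr321_inactive_pct

print(inactive_pct, pcr321_inactive_pct, wild_type_in_overlap_pct)
```

Under these assumptions the three percentages are mutually consistent, which supports reading the damaged figures in the paragraph above as 75%, 65% and 35%.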

Row 7 is the "mirror image" of row 4, with the frameshift at the
SphI site on the vector (see Figure 3) and 2 wild type fragments,
giving an integration of the wild type fragment into more than 90%
of the vectors.

Row 8 shows, like row 5, that the frameshift of PCR321 in the
overlap and gap region gives a very high number of inactives.

In row 9, fragment PCR333, with a frameshift in the vector overlap,
causes a very high number of inactives.

Row 10 gives a rather high number of inactives compared to rows 7
and 4. It is not increased in row 11.

Row 12 shows that two frameshifts on the vector give a lower
number of actives compared to one in row 7.



The recombination of 3 partial overlapping fragments into a gapped
vector is also very efficient, as seen in Table 16. The last row,
with the vector alone, gives very few colonies. As can be seen in
Figure 4, all fragments used are wt. In the first row in Table 16,
there are rather long overlaps between the vector and fragments,
but in the middle row the overlap between PCR353 and 355 is only 1
bp long and it is still very efficiently recombined. This
surprising result may be utilized for very easy domain shuffling of
even distantly related genes. For example, 3 different domains
from 10 different genes can be made as PCR fragments, designed to
have a 10 to 20 bp overlap by primer design, recombined together and
subsequently screened for the best combination (1000 possible
combinations).
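
The figure of 1000 combinations follows from choosing one of 10 source genes independently for each of 3 domain positions. A minimal sketch of that count, together with a check of the overlap shared by two adjacent fragments, is given below. The function names and the toy sequences are hypothetical and are not taken from the patent.

```python
from itertools import product


def domain_combinations(n_genes: int, n_domains: int):
    """One tuple per chimera: entry d is the index of the gene supplying domain d."""
    return product(range(n_genes), repeat=n_domains)


def shared_overlap(left: str, right: str) -> int:
    """Length of the longest suffix of `left` that is also a prefix of `right`."""
    for k in range(min(len(left), len(right)), 0, -1):
        if left.endswith(right[:k]):
            return k
    return 0


# 3 domains, each taken from any of 10 genes.
n_combinations = sum(1 for _ in domain_combinations(n_genes=10, n_domains=3))

# Toy fragments (hypothetical) that share a 16 bp end, inside the 10 to 20 bp range.
OVERLAP = "GACCGGATCCAAGCTT"
fragment_a = "ATGGCTAGCTT" + OVERLAP
fragment_b = OVERLAP + "GGTACCTTTAAA"
overlap_bp = shared_overlap(fragment_a, fragment_b)

print(n_combinations, overlap_bp)
```

The overlap check only compares sequence ends; it says nothing about how efficiently a given overlap recombines in the host cell.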



Table 16

Vector + Fragment                               Number of colonies    % of colonies with active Lipolase
pJS037/PvuII-SpeI + PCR353+PCR354+PCR3[...]7    > 5000                100%
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SEQUENCE LISTING 



(1) GENERAL INFORMATION:

(i) APPLICANT:
(A) NAME: Novo Nordisk A/S
(B) STREET: Novo Alle
(C) CITY: Bagsvaerd
(E) COUNTRY: Denmark
(F) POSTAL CODE (ZIP): DK-2880
(G) TELEPHONE: +45 4444 8888
(H) TELEFAX: +45 4449 3256

(ii) TITLE OF INVENTION: Method for preparing polypeptide variants
(iii) NUMBER OF SEQUENCES: 15

(iv) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: PatentIn Release #1.0, Version #1.30B (EPO)

(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "Primer 2243"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

ACAAACATTA CGTGCACGCC 20

(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "Primer 4599"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

CGGTACCCGG GGATCCAC 18

(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "Primer 5164"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

AATTACATCA TGCGGCCC 18

(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "Primer 8487"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

CATTTGCTCC GGCTGCAGGG A 21
(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 60 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "Primer 4548"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

GTGTTCCGCC GG7C7G7ACG G7CAGGAA77 C7GCAAAA ZZ 

uC.*o. .TCCG ACTCGGGGGG 50 
(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "Primer 5576"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

o vj i iw : vj l r.\- o G i C .-.G G AA7 7 C 21 

(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 19 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "Primer 5578"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

CGTTTCGGGT GACGGGGAC 19

(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "Primer 1596"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

GGAGCAAATG TCATTTAT 18

(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 64 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "Primer 4545"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

GCATTGGCAA CTGTTGCCGG AGCAGACCTG CGTGGAAATG

GGTATGATAT CGACGTGTTT TCAT 64

(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 876 base pairs 

(B) TYPE : nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY : circular 




(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

A.TG AGG A Z Z 7CC C77 ZZ Z ZZ Z Z Z Z Z Z Z G.^ . 1 ., j GCC TTG 4 6 

Met Arc Ser 5er leu V a 1 leu Phe ?he Val Ser Ala 7 re Thr Ala Leu 

I 5 10 15 

GCC ACT C C 7 A 7 7 C C 7 Z Z A G A C 777 7 C G C AG C A . C - G 777 AAC C AG T T C 95 

Ala Ser Pro He Arc Arc Glu Val Ser Gin Aso leu Phe Asn Gin Phe 

20 ' 25 30 

AA 7 C7C 777 G 7 A C A Z 7A7 7C7 G C A GCC GCA . ,-.C iGC Go A AAA AAC A A. 7 144 

Asn Leu Phe Ala Glr. 7 y r Ser Ala Ala Aia 7 y r Cys Gly Lys Asn Asn 

3 5 " 4 0 4 5 

GAT GCC CCA GC7 GG7 AC A AAC ATT ACG TGC ACG GGA AAT GCC TGC CCC 192 

Aso Aia Pro Ala Glv Thr Asn He Thr Cys Thr Gly Asn Ala Cvs Pro 

50 * 55 60 

GAG GTA GAG AAG GCG GAT GCA ACG TTT C7C 7 AC TCG TTT GAA GAC TCT 2 40 

Glu Val Glu Lvs Aia Aso Ala Thr Phe Leu 7yr Ser Phe Glu Asp Ser 

65 ' 10 75 * 80 



GGA GTG GGC GAT GTC ACC GGC TTC CT7 GCT CTC GAC AAC ACG AAC AAA 283 

Gly Val Gly Asd Val Thr Gly Phe Leu Ala leu Aso Asn Thr Asr: Lys 

65 50 95 

TTG ATC GTC CTC TCT TTC CGT GGC TCT CGT TCC ATA GAG AAC TGG ATC 336 

Leu lie Val Leu Ser Phe Arg Gly Ser Arc Ser He Glu Asn Trp He 

100 ' 105 110 

GGG AAT CTT AAC TTC GAC TTG AAA GAA ATA AAT GAC ATT TGC TCC GGC 38 4 

Gly Asn Leu. Asn Phe Aso Leu Lys Glu lie Asn Asp He Cys Ser Gly 

115 * 120 125 
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TGC AGG GGA CAT GAC GGC TTC ACT TCG TCC TGG AGG TCT GTA GCr rt~ 
Cys Arg Gly His Asp Glv Phe Thr Ser S»r Tm a™ i-i G f C GAl "2 



10 



? rg Gly His Asp Gly Phe Thr Ser Ser Tro Arg Ser Vat S £J 

ACG TTA AGG CAG AAG GTG GAG GAT GCT GTG AGG GAG CAT CCC GAC tat 
Thr Leu Arg Gin Lys Val Glu Asp Ala Val Arg Glu H? s Pro Sp ?JI 

160 

CGC GTG GTG TTT ACC GGA CAT AGC TTG GGT GGT GCA TTG GCA APT ptt ' 

Arg v.1 Val Phe Thr Gly His Ser Leu Gly Gly .Ala Lu S Th" Sal 528 



Ala Civ ffT ? AC F G ° GT GGA ^ T GGG TAT GAT ATC'GAC GTG TTT TCA 
l5 Ala Gly Ala ? sp Leu Arg Gly Asn Gly Tyr Asp lie Asp -^ai Jhe E 

185 190 

tC'3 rT° E CC CGA GTC GGA AAC A GG GCT TTT GCA GAA TTC CTG Arr 

Tyr Gly Ala Pro Arg Val Gly Asa Arg Ala.. Phe Ala Glu J£ 2S ?S 

20 * " 00 • 205 



25 



- i ?S ??= Sij ?S 5; £ K - £ - - - - - 

^ i: > 220 

val ^ Ar5 Ul ?~ Ilf 5?" ?' : ™C -CT A3C CCA 

225 ' u " ie - His Ser Ser Pro 

z ^ j * ~ _ 

240 



30 g'i-j ~ v r ^: ^ A :* c: ?~~ — ™ a:c cga ^g gat 753 

- - r .. f ;e. .-r .eu .a. ?rc Val 7hr a*- &3 

* » 3 AAG 



35 



ii| ^ ^ '|: Val Thr Arg jg Asp 

t:« :--c ?: w 0 " AAC CAG CC~ 

- S 2 ^ * iS As ? A :! 7 ^ -X Gly Asn As, Gin Pro 

270 

aJh ill f:I ?? G <*= f- A I GG * A ~ ess --a a-t ggg 

27= = C-i? ^ ?r? ? ^ - S G ^ *-eu lie Gly 



45 



50 



(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

HS l ArS SCr Ser Le £ Val Le ^ ?he ? ^ V.l Ser Ala Trp Thr Ala Leu 



15 



60 



65 



Ala Ser Fro lie Arg Arg Glu Val Ser Gin Asp Leu Phe Asp. Gin Ph» 

20 30 

Asn Leu Phe Ala Gin Tyr Ser Ala Ala Ala Tyr Cys Gly Lys Asn Asn 

^ y 4 5 

Asp Ala Pro Ala Gly Thr Asr, lie Thr Cys Thr Gly Asn Ala Cys Pro 

DD 60 



4S0 



576 



624 



672 



720 



316 



35^ 



290 

(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 292 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
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Glu Val Glu Lys Ala Asp Ala Thr Phe Leu Tyr Ser Phe Giu Asp Ser 
65 70 75 80 

Glv Val Gly Asd Val Thr Gly Phe Leu Ala Leu Asp Asn Thr Asn Lys 
5 * 85 90 95 

Leu lie Val Leu Ser Phe Arg Gly Ser Arg Ser lie Glu Asn Trp lie 
100 105 110 

10 Gly Asn Leu Asn Phe Asp Leu Lys Glu He Asn Asp lie Cys Ser Gly 
115 120 12 = 

Cys Arg Gly His Asp Gly Phe Thr Ser Ser Trp Arg Ser_VaJ._Ala Asp 
130 135 I 40 

T k- L o U Arg Gin Lys Val Glu Asp Ala Vai Arg Glu His Pro Asp Tyr 
145 150 I 55 160 

A-c Val Val Phe Thr Gly His Ser Leu Giy Gly Ala Leu Ala Thr Val 
20 "' 165 i 70 175 

Ala Gly Ala Asp Leu Arc Gly A sr. Gly Tyr Asp lie Asp Val Phe Ser 



15 



30 



45 



60 



65 



1 s n 



2 5 Tvr Giv Ala Pro Arc vai Giy Asr. Arc Ala ?r.s Ala Giu ?.-.e Leu 
" iss. ' 2 00 20; 



Val G 1 n Thr Giy Gly T.-.r Leu Tyr Arc :ie Tr.r His Thr A3, As? lie 
210 21: 2^0 



a e r "15 



Vai Pro Arc Leu Pro Pro Arg ._ . 
225 2u0 Z ^ J 

G<u Tvr Tro lie Lvs Ser Giv Thr Leu Vai Pro Vai Thr Arg Asn Asp 

35 " * 2i5 ' 25C 255 

lie Vai Lvs lie Giu Giv lie Asp Aia Thr Giy Giy Asn Asn Gin Pro 
260 ' * 255 210 

^0 Asn lie Pro Aso lie Pro Aia His Leu Trp Tyr Phe Giy Leu lie Giy 

275 ' 250 25d 



Cvs Lei 
250 



(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 876 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: circular

(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "Vector pJS037"

(vi) ORIGINAL SOURCE:
(B) STRAIN: Humicola lanuginosa

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..876

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:



35 40 

GAT GCC CCA GCA GGT ACA AAC ATT ACG TGC ACG GGA AAT GCA TGC rrr 
is Asp Ala Pro Ala Gly Thr Asn lie Thr Cys Thr Gly Ask £Uys Pro 

GAG GTA GAG AAG GCG GAT GCA ACG TTT CTC TAC TCG TTT GAA GAC TCT 
Glu Val Giu Lys Ala Asp Ala Thr Phe Leu Tyr Ser ?he SJ Aso Sel 
20 • 75 "80 

GGA GTG GGC GAT GTC ACC GGC ~~~ r~~ r,r-r — r * r --- . , 

Gly Val Gly Asp Val Thr civ Pne Leu Ala \% ^ ^ ™ 

e - 90 ?5 ' 

^ ?I= 5;f fig ^ g= •=■; ™ ~= ;f= «f 

: C i c = , , 

* ^ - * 1 - 

o'oo 7 C77 AA"" .^z^ — — - * * * r ' ~ . . ~ , ^ „ «„_ „^ 

J0 " 4y & Asn ?ns As? ~ s " ^ asp rie c y I s:: ^ 



125 



35 



iGC AGG GGA CA7 GAG GGC ' ~~r ~ ~ ~ * -~ . . ^„ 

o» j. 3 «, Sis , !? „, ;.; yr: ., .... „ £ « «j 

| ;" : K £ Sir si; Si SII j« - «; - - - 

40 " 3 160 

2c 51? W III A :: ^ CA I ~ : CTT =f 5 GT G - G c - gca act g-t 

- - - er - — G1 / ---ia Leu Ala v a > 

1== 170 i7 ; " 

40 - 3A ?? A GAC C?S CGT GZk ~ T GGG TAT GAT ATC GAC GTG T^A 

.-ia ,,y ,1a Asp _. a Arc Gly Asn Gly Tyr Asp lie Asp Val ?h S 

1SL 1S3 19C 

c n J AT G ? C GCC ccc CGA GTC GGT AAC CGT GCT TTT GCA GAA rrr jrr 

50 :yr Gly A a Pro Arg Val Gly Asn Arg Ala Phe Ala ^ ^ 

* 200 205 

GTA CAG ACC GGC GGT ACC CTC TAC CGC ATT ACC CAC ACC AA" GAT * T ~ 

55 5 S Giy Thr ^ Ar? Iie T ^ His ?hr ^ a2o 

ii3 220 

GTC CCT AGA CTC CCG CCT CGA GAA TTC GGT TAC AGC CAT t C t k r.r era 

Val Pro Arg Leu Pro Pro Arg Gi, Phe Gly Tyr Ser H?s Ser Se- Pro 

60 • 230 235 240 

GAG TAG TGG ATC AAA TCT GGA ACA CTA GTC CCC GTC ACC CGA AAC cat 

Glu :yr Trp He Lys Ser Gly Thr Leu Val Pro Val Thr Arg £ ^ 
245 250 255 

65 ATC GTG AAG ATA GAA GGC ATC GAT GCC ACC GGC GGC AAT AAC CAr rrr 
He Val Lys lie Glu Gly He Asp Ala Thr Gly Gly HI £J 

265 270 



48 
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ATG AGG AGC'TCC CTT GTG CTG TTC TTT GTC TCT GCG TGG ACG GCC rrr 
Met Arg Ser Ser Leu Val Leu Phe Phe Val Ser Ala Trp Thr Ala Leu 
5 10 15 

5 GCC AGT CCT ATA CGT AGA GAG GTC TCG CAG GAT CTG TTT AAC TAP rrr 

Ala Ser Pro lie Arg Arg Giu Val Ser Gin Asp Leu III cfn III * 
ZQ 25 30 

AAT CTC TTT GCA CAG TAT TCA GCT GCC GCA TAC TGC GGA AAA AAC AAT 
10 Asn Leu Phe Ala Gin Tyr Ser Ala Ala Ala Tyr Cys Gly £JJ ^ 144 



192 



240 



236 



j3o 



33 4 



432 



430 



52 = 



576 



624 



67; 



720 



768 



816 
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AAC ATT CCG GAT ATC CCT GCG CAC CTA TGG TAG TTC GGG TTA ATT GGG 
Asn lie Pro Asd lie Pro Ala Mis Leu Tro Tyr Phe Gly Leu He GW 
275 280 * 285 

ACA TGT CTT TAG 
Thr Cys Leu 
290 



(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 292 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

Met Arg Ser Ser Leu "a I Leu Phe Phe Val Sir Ala Tro Thr A^ a * <=><• 
^ 5 10 ' is 

Ala Ser Pro He Arc Arc Giu Val Ser Gin Aic Leu Phe Asn Gin 
2: 25 30 

Asn Leu Phe Ala Gin Tvr Ser Ala Ala Ala T.r Z;$ Giv Lvs Asr As~ 
25 ' 4 1 4 5 * 

Asp Ala f:: Ala Glv Tnr Asn He Tnr Lvs T~: Glv Asn Ala Cys - 
S3 35 50 

Giu Val Giu Lys Ala Asp Ala Tr.r Phe Leu Tyr Ser Phe Giu' Aso Ser 
65 70 "5 * so 

Gly Val Gly Asp Val Tnr Glv Phe Leu Ala Leu Aso Asn Thr Asn Lvs 
E5 * 90 95 " 

Leu He Val Leu Ser Pr.e Arc Glv Ser Arg Ser He Giu Asn Tro li* 
100 105 110 

Gly Asn Leu Asn Phe Aso Leu Lvs Giu He Asn Aso He Cvs Ser Giv 
U5 ' 120 * ■ 125 " 

Cys Arg Gly His Aso Glv Phe Tnr Ser Ser Tro Arg Ser Val Ala Aso 
130 135 ' 140 

Thr Leu Arg Gin Lys Val Giu Aso Ala Val Arc Giu His Pro Aso Tyr 
145 150 155 * 160 

Arg Val Val Phe Thr Gly His Ser Leu Giv Giv Ala Leu Ala Thr Val 
165 170 175 

Ala Gly Ala Asp Leu Arc Giv Asn Giv Tvr Aso lie Aso Val Phe S°r 
180 1S5 * " 190 

Tyr Gly Ala Pre Arc Val Giv Asn Arc Ala Phe Ala Giu Phe Leu Thr 
195 200 205 

Val Gin Thr Gly Giv Thr Leu Tyr Arg He Thr His Thr Asn Asp He 
210 215 220 

Val Pro Arg Leu Pro Pro Arg Giu Phe Gly Tyr Ser His Ser Ser Pro 
225 230 235 240 

Giu Tyr Trp He Lvs Ser Gly Thr Leu Val Pro Val Thr Arg Asn Asp 
245 250 255 
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He Vai Lys lie Glu Gly He Asd Ala Thr Gly Giv Asn Asn Gin Pro 

260 * 265 270 

5 Asn He Pro Aso He Pro Ala His Leu Trp Tyr Phe Gly Leu He Gly 

275 " 280 285 



10 



Thr Cys Leu 
290 



(2) INFORMATION FOR SEQ ID NO: 14: 



(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 864 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(vi) ORIGINAL SOURCE:
(B) STRAIN: Pseudomonas sp.

(ix) FEATURE:
(A) NAME/KEY: mat_peptide
(B) LOCATION: 1..864

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..864

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

TTC GGC TCG T G G AA. C TAT ACT AA 3 ACT - .-. j ; ,~. G C ~ A . « G T C C T G ACT 43 

Phe Glv Ser Ser Asn Tvr Thr Lvs Thr Gin Tyr ?::■ He Vai Leu Thr 

i 5 10 15 



4 0 CAC GGC ATG CTC GGT TTC GAG AGE CTG GTT GGA GTG GAG TAC TGG TAG 96 

.eu Leu Giv 7a 1 Aso Tvr 
2 5 * 30 



His Glv Me: Leu Giv ?.-.e Aso Ser Leu Leu Gly Val Asp Tyr Trp Tyr 

3C 



GGC ATT CCG TGA GCC GTG GGT AAA GAG GGC GCC AGC GTC TAC GTC ACC 14 4 

4 5 Glv He Pre Ser Ala Leu Arc Lvs Asp Gly Ala Thr Vai Tyr Val Thr 
35 40 45 



GAA GTC AGG CAG CTC GAG ACC TCC GAA GGC CGA GGT GAG CAA CTG CTG 192 
Glu Vai Ser Gin Leu Aso Thr Ser Glu Ala Arg Giv Glu Gin Leu Leu 
50 50 * 55 60 



ACC CAA GTC GAG GAA ATC GTG GCC ATC AGC GGC AAG CCC AA.G GTC AAC 24 0 

Thr Gin Val Glu Glu He Val Ala He Ser Gly Lys Pro Lys Vai Asn 
65 70 75 30 

CTG TTC GGG CAC AGC CAT GGC GGG CCT AGC ATC CGG TAC GTT GCC GCC 288 
Leu Phe Giv His Ser His Gly Gly - Pro Thr He Arg Tyr Val Ala Ala 
85 90 95 



60 GTG CGC CCG GAT CTG GTC GCC TCG GTC ACC AGC ATT GGC GCG CCG CAC 33 6 

Vai Arc Pro Aso Leu Val Ala Ser Val Thr Ser lie Gly Ala Pro His 
100 105 110 

AAG GGT TCG GCC ACC GCC GAC TTC ATC CGC CAG GTG CCG GAA GGA TCG 38 4 

65 Lvs Giv Ser Ala Thr Ala Asp Phe He Arg Gin Vai Pro Glu Gly Ser 
11: 120 125 
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GCC AGC GAA GCG ATT CTG GCC GGG ATC GTC AAT GGT CTG GGT GCG CTG 4 32 

Ala Ser Glu Ala lie Leu Ala Giy He Val Asn Gly Leu Gly Ala Leu 
130 135 140 

ATC AAC TTC CTT TCC GGC AGC AGT TCG GAC ACC CCA CAG AAC TCG CTG 4 80 

He Asn Phe Leu Ser Gly Ser Ser Ser Asp Thr Pro Gin Asn Ser Leu 
145 150 155 160 

GGC ACG CTG GAG TCA CTG AAC TCC GAA GGC GCC GCA CGG TTT AAC GCC 526 
Gly Thr Leu Glu Ser Leu Asn Ser Glu Gly Ala Ala Arg Phe Asn Ala 
165 170 175 

CGC TTC CCC CAG GGG GTA CCA ACC AGC GCC TGC GGC GAGJ3GZ GAT TAC 57 6 

Arg Phe Pro Gin Gly Vai Pro Thr Ser Ala Cys Gly Glu £ly~"Asp Tyr 
180 185 190 

GTG GTC AAT GGC GTG CGC TAT TAC TCC TGG AGG GGC ACC AGC CCG CTG 62 4 

Val Val Asn Gly Val Arc Tyr Tyr Ser Trp.Arg Gly Thr Ser Pro Leu 
195 ' 200 ' 205 

ACC AAC GTA CTC GAC CCC TCC GAC CTG CTG CTC GGC GCC ACC TCC CTG 67 2 

Thr Asn Val Leu Aso Pro Ser Aso Leu Leu Leu Glv Aia Thr Ser Leu 
210 215 220 

ACC TTC GGT TTC GAG GCC AAC GAT GGT CC . C GGA CGG TGG AGC TGG 720 
Thr Phe Glv Phe Glu Ala A sr. Aso Glv Leu Val Glv Arg Cvs Ser Ser 

225 22: ;:*= * 24: 

CGG CTG GGT ATG GTG ATC GGC GAC AAC TAG ~_-G ATG AA.C GAG GTG GAG 7 5; 

Arg Leu Giv Met Val He Arc Aso Asr. Tvr Arg :■ ■ e : Asn His Leu Aso 
245 ' 25C 255 

GAG G.G AAC CAG AGG TTG GGG CTG AGC Aj. .-.T-„ TTC GAG- AGG' AGC CCG S15 
Glu Val Asn Gin Tr.r Phe Glv Leu Thr Ser He Phe Glu Thr Ser Pro 
250 ' 265 270 

GTA TGG GTG TAT CGG GAC CAA GGG AAT CGG CTG AAG AAC GGG GGG CTC 8 64 

Val Ser Val Tvr Arc Gin Gin Ala Asn Arg Le : Lvs Asn Ala Glv Leu 
275 ' " 252 2S5 

(2) INFORMATION FOR SEQ ID NO: 15:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 288 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

Phe Glv Ser Ser Asn Tyr Thr Lvs Thr Gin Tvr Pro lie Vai Leu Thr 
I 5 10 15 

His Giy Met Leu Gly Phe Aso Ser Leu Leu Giv Vai Aso Tvr Tro Tyr 
20 25 30 

Glv lie Pro Ser Ala Leu Arg Lvs Aso Giy Aia Thr Vai Tvr Val Thr 
35 40 45 

Glu Val Ser Gin Leu Aso Thr Ser Glu Aia Arg Gly Glu Gin Leu Leu 
50 * 55 60 

Thr Gin Val Glu Glu He Val Aia lie Ser Gly Lys Pro Lys Val Asn 
65 70 75 80 

Leu Phe Gly His Ser His Gly Gly Pro Thr He Arg Tyr Vai Ala Ala 
85 90 95 
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Val Arg Pro Asd Leu Val Ala Ser Val Thr Ser He Gly Ala Pro His 
100 105 110 

Lys Gly Ser Ala Thr Ala Asd Phe He Arg Gin Val Pro Glu Gly Ser 
115 " 120 125 

Ala Ser Glu Ala He Leu Ala Glv He Val Asn Gly Leu Gly Ala Leu 
130 135 * 140 

He Asn Phe Leu Ser Gly Ser Ser Ser Asp Thr Pro Gin Asn Ser Leu 
145 150 155 160 

Gly Thr Leu Glu Ser Leu Asn Ser Glu Gly Ala Ala Arg Phe-Asn Ala 
165 170 175 

Arg Phe Pro Gin Gly Val Pro Thr Ser Ala Cys Gly Glu Gly Asp Tyr 
180 135 . 190 

Val Val Asn Glv Val Arg Tvr Tvr Ser Tro Arc Gly Thr .Ser Pro Leu 
195 " * 200 205 
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PATENT CLAIMS 

1. A method for preparing polypeptide variants by shuffling
different nucleotide sequences of homologous DNA sequences by in
vivo recombination comprising the steps of

a) forming at least one circular plasmid comprising a DNA sequence
encoding a polypeptide,

b) opening said circular plasmid(s) within the DNA sequence(s)
encoding the polypeptide(s),

c) preparing at least one DNA fragment comprising a DNA sequence
homologous to at least a part of the polypeptide coding region on at
least one of the circular plasmid(s),

d) introducing at least one of said opened plasmid(s), together with
at least one of said homologous DNA fragment(s) covering full-length
DNA sequences encoding said polypeptide(s) or parts thereof, into a
recombination host cell,

e) cultivating said recombination host cell, and

f) screening for positive polypeptide variants.

2. The method according to claim 1, wherein more than one cycle of
step a) to f) are performed.

3. The method according to claims 1 and 2, wherein two or more
opened plasmids are shuffled with one or more homologous DNA
fragments in the same shuffling cycle.

4. The method according to any of claims 1 to 3, wherein the opened
plasmid(s) is(are) gapped.

5. The method according to any of claims 1 to 4, wherein the ratio
between the opened plasmid(s) and homologous DNA fragment(s) are in
the range from 20:1 to 1:50, preferable from 2:1 to 1:10 (mol
vector:mol fragments) with the specific concentrations being from 1
pM to 10 M of the DNA.

6. The method according to any of claims 1 to 5, wherein 2 or more,
preferably from 2 to 6, especially 2 to 4 of the DNA fragments have
partially overlapping regions.

7. The method according to claim 6, wherein the overlapping regions
of the DNA fragments lies in the range from 5 to 5000 bp, preferably
from 10 bp to 500 bp, especially 10 bp to 100 bp.

8. The method according to any of claims 1 and 8, wherein at least
one cycle of step a) to f) is backcrossing with the initially used
DNA fragments.

9. The method according to any of claims 1 and 8, wherein the
plasmid(s) is(are) opened in the region around the middle of the DNA
sequence(s) encoding the polypeptide(s).



10. The method according to any of claims 1 to 9, wherein the
plasmid(s) is(are) opened close to a mutation in the DNA sequence(s)
encoding the polypeptide(s).

11. The method according to any [...], wherein the DNA
fragment(s) prepared in step c) [...] under conditions
suitable for high, medium or low [...].

12. The method according to any of claims 1 to 11, wherein the
polypeptides producible from the input DNA sequences are enzymes or
proteins with biological activity.




13. The method according to claim 12, wherein the polypeptides are
enzymes selected from the group including proteases, lipases,
cutinases, cellulases, amylases, peroxidases, oxidases and phytases.

14. The method according to claim 12, wherein the polypeptides are
proteins with biological activity selected from the group including
insulin, ACTH, glucagon, somatostatin, somatotropin, thymosin,
parathyroid hormone, pigmentary hormones, somatomedin,
erythropoietin, luteinizing hormone, chorionic gonadotropin,
hypothalamic releasing factors, antidiuretic hormones, thyroid
stimulating hormone, relaxin, interferon, thrombopoietin (TPO) and
prolactin.

15. The method according to any of claims 1 to 13, wherein at least
one of the initially used input DNA sequences is a wild-type DNA
sequence, such as a DNA sequence coding for wild-type enzymes, in
particular lipases, derived from filamentous fungi, such as Humicola
sp., in particular Humicola lanuginosa, especially Humicola
lanuginosa DSM 4109.

16. The method according to claim 15, wherein at least one of the
input DNA sequences is selected from the group of vectors (a) to (f)
and/or DNA fragments (g) to (aa) coding for Humicola lanuginosa
lipase variants.

17. The method according to any of claims 1 to 13, wherein at least
one of the initially used input DNA sequences is a wild-type DNA
sequence, such as a DNA sequence coding for wild-type enzymes, in
particular lipases, derived from filamentous fungi of the genera
Absidia, Rhizopus, Emericella, Aspergillus, Penicillium,
Eupenicillium, Paecilomyces, Talaromyces, Thermoascus and
Sclerocleista.

18. The method according to any of claims 1 and 13, wherein at least
one of the initially used input DNA sequences is a wild-type DNA
sequence, such as a DNA sequence for wild-type enzymes, in
particular lipases, derived from bacteria, such as Pseudomonas sp.,
in particular Ps. fragi, Ps. stutzeri, Ps. cepacia, Ps. fluorescens,
Ps. plantarii, Ps. gladioli, Ps. alcaligenes, Ps. pseudoalcaligenes,
Ps. mendocina, Ps. aeruginosa, Ps. glumae, Ps. syringae, Ps.
wisconsinensis, or a strain of Bacillus sp., in particular B.
subtilis, B. stearothermophilus or B. pumilus, or a strain of
Streptomyces sp., in particular S. scabies, or a strain of
Chromobacterium sp., in particular C. viscosum.

19. The method according to any of claims 1 to 12, wherein at least
one of the initially used input DNA sequences is a variant DNA
sequence, such as a DNA sequence coding for a variant enzyme, in
particular lipase variants, derived from yeasts, such as Candida
sp., in particular Candida rugosa, or Geotrichum sp., in particular
Geotrichum candidum.

20. The method according to any of claims 1 to 19, wherein the
homologous input DNA sequences are at least 60%, preferably at least
70%, better more than 80%, especially more than 90%, and even up to
100% homologous.



21. The method according to any of claims 1 to 20, wherein the
recombination host cell is a eukaryotic cell, such as a fungal cell
or a plant cell.



22. The method according to claim 21, wherein said fungal cell is a
yeast cell from the group of cells of Saccharomyces sp., in
particular strains of Saccharomyces cerevisiae or Saccharomyces
kluyveri, or Schizosaccharomyces sp., in particular
Schizosaccharomyces pombe, or Kluyveromyces sp., such as [...],
or Hansenula sp., in particular H. polymorpha, or Pichia sp., in
particular P. pastoris, or a filamentous fungi from the group of
Aspergillus sp., in particular A. [...],
or Neurospora sp., or Fusarium sp., in particular [...].



23. The method according to [...] 22, wherein the
plasmid [...] the polypeptide(s) is(are) [...].

24. The method according to [...], wherein said DNA
[...] is(are) operably linked to a promoter sequence.

25. The method according to [...], wherein the plasmid is an
expression plasmid.

26. The method according to claim 25, wherein the expression
plasmid is pJS026 or pJS037.

Title: Method for preparing polypeptide variants
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CGTCGAGAGGTCTCGCAGGATCTGTTTA.ACCAGTTCAATCTCTTTGCACAGTATTC7GCA 
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GCCGCATACTGCGGAAAAAACAATGATGCCCCAGCTGGTACAAACATTACGTGCACGGGA 
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A7GAGGAGC7CCC77G7GC7G77C777G7C7C7GCG7GGACGGCC77GGCCAG7CC7ATA 

5447 — + * + + . 5506 

1 MRSSLVLFFVSAWTALASPI 
SnaBI PvuII 
CGTAGAGAGGTCTCGCAGGATCTCTTTAACCAGTTCAATCTCTT7GCACAGTATTCAGCT 
5507 ~- + + - + + 5566 

21 RREVSQDLFNQFNLFAQYSA 

GCCGCATACTGCGGAAAAAACAATGATGCCCCAGCAGG7ACAAACATTACGTGCACGGGA 

5567 — -r + + 5626 

4i AAYCGKNN DAPAGTNITCTG 
SDhI 

AATGCATGCCCCGAGGTAGAGAAGGCGGATGCAACGTT7CTCTACTCGTTTGAAGACTCT 
5627 + - + + T -V"-„„ + 5686 

6i NACPEVEKADA7FLYSFEDS 

Hindi I I 

GGAGTGGGCGA7G7CACCGGCT7CC77GCTC7CGACAACACGAACAAGCTTATCG7CCTC 
5 63 7 — * . 57 , 5 

£1 G V G D V 7 G F L A * 1 D N 7 N X L 1 V I 

3gIII 

. C 7 .. 7 CCG 7 G G 3 7 7 AAGATCT A 7 .-. G A _ . w . „ w.~.A7 C 7 . C 7 7 C G AC 7 7 G AAA 
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